A COMPARISON OF TIME SERIES ANALYSIS
AND VISUAL ANALYSIS OF BEHAVIORAL DATA

By 

George  Porter  Horne 
August  1977 

Chairman:    William  B.  Ware 

Major  Department:     Foundations  of  Education 

There  is  much  disagreement  in  education  and  related  areas  as  to  the 
appropriate  analytic  techniques  for  use  with  data  from  single  subject 
designs.     A  review  of  current  statistical  approaches  suggested  that  time 
series  analysis  may  be  the  most  appropriate  technique  in  view  of  the 
presence  in  such  data  of  both  regression  trends  and  dependence  in  the 
remaining  residuals. 

This study experimentally evaluated the appropriateness of time
series techniques in combination with regression analysis for the detection
and description of trends, periodic components, and other dependence
relations in behavioral data. Accuracy of predictions made with time
series analysis was used to judge the adequacy of models fitted to five
sets of behavioral data.

Methods  are  discussed  by  which  these  predictions  made  using  time 
series  techniques  were  compared  to  predictions  gathered  from  two  groups 
of  subjects.    The  first  group  was  fourteen  experts  in  the  visual  analysis 
of  data  using  the  Standard  Behavior  Chart.    The  second  group  was  teachers 
who  use  the  Standard  Behavior  Chart  in  their  precision  teaching. 

The analyses performed indicated that both the average magnitude of
the errors and the bias in prediction were smaller for time series analysis
predictions than for predictions by both groups of subjects.

The  results  of  this  study  suggested  that  time  series  analysis,  in 
conjunction  with  regression  analysis,  does  provide  an  appropriate  and 
precise  set  of  techniques  for  detecting  and  describing  structure  in 
behavioral  data.    Although  restrictions  on  its  use  preclude  widespread 
application  in  field  settings,  time  series  techniques  hold  considerable 
potential  for  use  with  human  behavioral  research  in  education  and  other 
areas.
CHAPTER I
INTRODUCTION 


Single subject designs have been of great interest in work with human
subjects in several areas, including education. Chassan (1959, 1960, 1961,
1967) has strongly advocated their use in clinical psychology and psychiatry.
Also, much of the research which takes the individual human subject as
the unit of experimental analysis has followed from the operant animal
research of B. F. Skinner and his colleagues. Within education, two
major areas of single subject behavioral research have been the precision
teaching work begun by Ogden Lindsley (1972) and Carl Thoresen's work in
counseling (e.g., Thoresen and Anton, 1974). In addition, behavior
modification, with its single subject orientation, has become a "household word"
in educational circles.

However, researchers have not agreed on the analytic techniques
appropriate for use with the resulting data, which are repeated measurements
over time of some characteristic of a single subject. A number of
alternative analytic techniques are discussed below. These techniques include
visual analysis (aided by the Standard Behavior Chart) and time series
analysis.

Considerable interest has been focused on accuracy of prediction as a
criterion for judging the effectiveness of analysis. In fact, Koenig
considers prediction to be "the ultimate criterion of effectiveness in any
science, ... In the final analysis, a science of education will be only as
effective as its predictive capabilities for each individual." (1972, p. 1)
A similar notion has been expressed by Mitchell in his unifying
theme that educational psychology must ". . . develop scientifically sound
methodological approaches adequate to the task of making accurate
predictions of individual behavior from the context of empirically assessed
individual-environment interactions" (1969, p. 669). Also, Johnston and
Pennypacker (in press) consider prediction of human behavior to be "the
most elegant use of quantification and the one upon which rests the
validation of all scientific and technological activity."

Statement  of  Purpose 
Within  this  context,  the  major  purpose  of  this  dissertation  was  to 
explore  the  adequacy  of  descriptions  of  behavioral  data  using  time  series 
analysis. Accuracy of predictions made using time series techniques
provided the criterion for judging the adequacy of the results. To accomplish
this,  the  accuracy  of  prediction  using  time  series  analysis  was  compared 
with  the  accuracy  of  prediction  of  those  who  use  the  Standard  Behavior 
Chart  and  associated  analytical  techniques.    The  design  chosen  allowed 
estimation  of  the  effect  of  a  number  of  possibly  confounding  sources  for 
the  observed  differences. 

Importance 

The importance of judging the adequacy of time series analysis for
use with behavioral data lies in the several advantages which these
techniques offer for research in education and related areas. Time series
analysis results in a high level of both precision and standardization of
description, allowing more accurate comparison. The resulting parameter
estimates possess known statistical properties which allow statements of
confidence as to the range of values that might be observed were more data
collected under identical circumstances. Also, transfer function modeling
provides powerful techniques for detecting and describing the relationship
between  two  sets  of  behavioral  data. 

These  advantages  are  available  within  certain  restrictions  (discussed 
below)  on  the  type  of  data  for  which  time  series  analysis  is  appropriate. 
Although  attention  to  these  restrictions  is  necessary  to  avoid  abuse  of 
time  series  techniques,  the  proper  use  of  time  series  analysis  holds 
considerable  potential  for  assisting  human  behavioral  research  in 
education  and  related  fields. 

Summary  of  the  Remaining  Chapters 
The  literature  relevant  to  the  topic  of  this  dissertation  is  reviewed 
in Chapter II. Chapter III then contains details of the time series
analysis of five data sets. Discussions are also included on the methodology
used  both  to  collect  data  on  predictions  from  two  groups  of  subjects  and 
to  analyze  the  data.    Results  of  the  comparison  of  subjects'  predictions 
with  predictions  made  using  time  series  analysis  are  presented  in  Chapter 
IV.    Chapter  V  concludes  the  dissertation  with  a  discussion  of  the  results. 


CHAPTER  II 
REVIEW  OF  THE  LITERATURE 

Three  areas  of  relevant  literature  were  reviewed  as  part  of  this 
study.    The  first  major  portion  of  the  chapter  reviews  alternatives  to 
both  visual  and  time  series  analysis  of  behavioral  data.     It  is  followed 
by reviews of literature on visual analysis and on the time series
analysis techniques used for this research. A discussion of literature
regarding prediction of behavior concludes the chapter.

Alternative Techniques
Statistical  inference  has  been  viewed  by  many  operant  researchers 
as  a  mixed  blessing  at  best,  and  often  as  a  curse  (see  Michael,  1974,  and 
Johnston and Pennypacker, in press). Their preference is for experimental
control of residual variation, and this is obviously a highly desirable
but never completely attainable goal. The result is that, especially in
applied behavioral analysis, the data and accompanying descriptive
statistics sometimes do not provide effective information for the experimenter.
Recently  suggested  solutions  to  this  problem  discussed  below  are,  however, 
unfortunate  in  view  of  the  published  work  on  time  series  analysis,  and 
these  suggestions  have  doubtless  only  increased  resistance  to  the  use  of 
inferential  statistics  with  data  from  single  subject  designs.  Those 
suggestions  considered  are  Analysis  of  Variance,  two  non-parametric  tests, 
and  regression  analysis. 

Analysis  of  Variance 

Shine and Bower (1971) and Gentile, Roden, and Klein (1972) have
suggested the use of classical Analysis of Variance techniques with data such
as those in Appendix A. (Details concerning these data sets are given in
Chapter III. Each set involved the daily measurement over a period of
weeks of behavior of a single subject displayed using the Standard
Behavior Chart.) This suggestion was made under two thoroughly
unacceptable assumptions: (a) that each of the measurements within an
experimental phase represents random variation around a stationary mean and (b)
that the random fluctuations are independent.

These two articles have ignored a half century of work with time
series and sparked a set of replies in the winter 1974 issue of the Journal
of Applied Behavior Analysis. Kratochwill, Alden, Demuth, Dawson, Panicucci,
Arntson, McMurray, Hempstead, and Levin (1974) have shown quite clearly
that the data of the Gentile et al. (1972) article were in fact not
independent. Thoresen and Elashoff (1974) have pointed out that to ignore the
presence of trends is to invalidate the statistical test and have suggested
time series analysis as an alternate procedure. Keselman and Leventhal's
(1974) modifications of the analysis are unacceptable, even if there is
interest in multiple subject designs. Hartmann's (1974) suggested
modifications make possible a more valid ANOVA test, if the tests of
independence are used. However, within phase independence is unlikely, and his
suggestion of using only the portion of a phase after stability is reached
ignores a great deal of data in which the greatest interest often lies.
Statistics should assist analysis, not dictate design, and Hartmann has
also suggested time series analysis as a possible alternative.

Non-parametric  Tests 

Two authors have suggested non-parametric tests for single subject
data. Revusky (1967) has assumed independence of repeated measurements
and the use of a multiple baseline design in terms of subjects, behaviors,
or occasions in his suggested alternative to ANOVA. Thus, his technique
is unsuited to a single set of time series data.

In an even more recent article Edgington (1975) has ignored the
issues of dependence and trends and advocated the disruption of experimental
design by random assignment of the dates of change in independent variables.
The data were then split into sets in all other possible ways and the
weighted mean differences between sets computed. The proportion of
alternative splits for which the weighted mean difference is greater than
that for the observed split is the attained significance level of the test.
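
The splitting procedure just described can be sketched for the simplest case of one change between two phases. This is a minimal illustration only: it uses an unweighted difference between phase means rather than the weighted difference mentioned above, and the function name, the minimum phase length, and the convention of counting the observed split among the candidates are all assumptions made for the example.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def change_point_randomization_test(series, observed_change, min_phase=3):
    """Proportion of admissible change points whose difference in phase
    means is at least as large as the difference at the observed change.

    series          -- measurements in time order
    observed_change -- index of the first measurement of the second phase
    min_phase       -- smallest phase length allowed (hypothetical choice)
    """
    def phase_difference(change):
        return mean(series[change:]) - mean(series[:change])

    candidates = range(min_phase, len(series) - min_phase + 1)
    if observed_change not in candidates:
        raise ValueError("observed change point is not an admissible split")

    observed = phase_difference(observed_change)
    at_least_as_large = sum(
        1 for change in candidates if phase_difference(change) >= observed
    )
    return at_least_as_large / len(candidates)
```

Because the proportion is computed over reorderings of the split point only, the sketch shares the weakness noted in the text: it takes no account of trends or serial dependence within a phase.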

Regression  Analysis 

A  far  more  acceptable  approach  than  ANOVA  or  the  non-parametric  tests 
just  discussed  is  that  of  regression  model  building  as  discussed  by  Kelly, 
McNeil, and Newman (1973) and suggested in simple linear form by Armenakis
and Feild (1975). This approach has also been advocated by Chassan (1967,
p. 195) and Campbell (1963) with some reservations. However, the suggested
tests of significance of the resulting parameters assume independence.
Bartlett (Note 1) rejected the validity of such tests for
time series data 42 years ago. In the same year Aitken (Note 2) showed
that for known dependence structure in the residuals, it is possible to
transform the data to obtain an appropriate test. Work continued steadily
with important articles by Cochrane and Orcutt (1949) and Watson (Note 5)
which  have  stressed  the  dangers  of  testing  regression  parameters  in  the 
presence  of  an  unknown  dependence  structure  in  the  regression  residuals. 
Much  of  this  history  has  been  traced  by  R.  L.  Anderson  (1954)  without 
reaching  any  definitive  conclusions  as  to  the  appropriate  solution. 

In  addition  to  problems  associated  with  assumptions  of  independence, 
a  second  weakness  of  a  simple  regression  approach  to  testing  trends  in 
time series is the failure of the technique to take account of seasonal
variation in time series (for example, the weekly component in Appendix A,
figure 2).

Visual  Analysis 

The  end  result  of  the  problems  presented  above  is  that  most  operant 
researchers  have  relied  entirely  on  visual  analysis  of  their  data.  The 
development  of  the  Standard  Behavior  Chart  has  facilitated  this  process  in 
that it provides a standardized display format. The behavior of interest
is observed repeatedly (typically daily) and the measurement of the
characteristic of interest is placed at the appropriate intersection of an
equal interval day line and an equal ratio (logarithmic) frequency line
(see Appendix A for five examples of this form of visual display).
Interest is focused on changes occurring over time in the behavior being
measured.

The  data  also  are  summarized  by  descriptive  statistics.     Linear  fits 
to  the  set  of  points  within  each  chart  phase  allow  the  calculation  of 
celeration  as  a  measure  of  the  direction  of  change  across  time  in  the 
data. A quarter-intersect method (see Koenig, 1972, p. 13) is available
where the slope of the line is uncertain. (White, Note 4, has suggested
a  modification  assuring  that  one  half  of  the  points  fall  above  the  line  and 
one  half  below.) 

Celeration is the equivalent of slope in linear regression, which
together with intercept provides a complete description of the trend line.
Variation around the trend line is described as the range in the form of
total bounce (see Pennypacker, Koenig and Lindsley, 1972). Periodicity
is a characteristic of time series in general (Holtzman, 1963) and is often
noted in behavioral data. Periodically recurring patterns frequently
occur in the form of weekly cycles. All of these characteristics of
behavioral time series are used by visual analysts as aids in the prediction
of future data.

Time  Series  Analysis 

During the last six years, statisticians working on combining
regression analysis with time series analysis have developed a sufficient array
of techniques to describe behavioral data from single subject experiments,
and, if desired, to test appropriate hypotheses regarding the statistical
significance of regression trends.

The first step is the estimation of any deterministic trends through
traditional regression techniques involving least squares fitting. Part
of the variance of the resulting residuals may be due to periodic
components. Detection of such components is accomplished through two techniques.

Detection  and  Description  of  Periodicity 

The first technique for detecting periodicity involves computing the
autocorrelation function for the residuals. This function is the sequence
of $\hat{\rho}_k$'s for various lags $k$, where $\hat{\rho}_k$ is defined as:

$$\hat{\rho}_k = \frac{\sum_{i=1}^{N-k} (X_i - \bar{X})(X_{i+k} - \bar{X})}{\sum_{i=1}^{N} (X_i - \bar{X})^2}$$

where $X_i$ is the observation at time $i$.

The approximate distribution of the autocorrelation function has been
known since Bartlett's (1946) classic paper. In addition, Pierce (1971a)
has shown that the distributional properties are asymptotically independent
of least squares regression trends. The autocorrelation function is useful
for detecting periodic components because a periodic component with a cycle
of k days will result in relatively large autocorrelations at all lags
i*k, where i takes on integer values.
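
As an illustration of the definition above, the sample autocorrelations can be computed directly from a list of residuals. The sketch below is not the Fortran routine used for this research; the function name and the choice to return lags 1 through `max_lag` are assumptions made for the example.

```python
def autocorrelation(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag.

    The numerator sums the N-k lagged cross-products of deviations from
    the mean; the denominator is the full sum of squared deviations.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [value - mean for value in x]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]
```

For a series with a weekly spike (one large value every seventh day), the values returned for lags 7 and 14 stand out against the remaining lags, which is the pattern the text describes for a cycle of k days.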

The second important technique for exploring seasonality (periodicity)
in the data is spectral density analysis. Conceptually, this technique
takes samples from a series of sine waves of varying frequencies (each one
appropriately adjusted in both amplitude and phase). Each of these sets
of samples is taken at the same sampling rate as the time series data
(typically one per day), and each discretely sampled sine wave is compared
to the data of the time series. If there is periodic oscillation in the
time series, it will be displayed as a peak in the spectral density curve
at the appropriate period. T. W. Anderson (1971, chapter 10, section 3)
extended Hannan's (1958) earlier work, showing that estimates of spectral
density are asymptotically unaffected by using residuals from a regression
fitted by least squares. These estimates of spectral density are computed
using the autocorrelation function, and may contain small peaks that do not
represent seasonality in the data. Agreement between the two techniques,
however, is a strong indication of periodicity.
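
A spectral density estimate of this kind can be sketched as a weighted cosine sum of the first M autocorrelations. The text does not say which lag window was used, so the Tukey-Hanning window below is an assumption, as are the function names and the illustrative 7-day sine series.

```python
import math


def autocorrelation(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [value - mean for value in x]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


def spectral_density(r, freqs):
    """Smoothed spectral estimate from autocorrelations r (lags 1..M).

    freqs are in cycles per observation, between 0 and 0.5.
    """
    m = len(r)
    density = []
    for f in freqs:
        total = 1.0  # contribution of the lag-0 autocorrelation
        for k in range(1, m + 1):
            # Tukey-Hanning lag window: an assumption, not the source's choice.
            weight = 0.5 * (1.0 + math.cos(math.pi * k / m))
            total += 2.0 * weight * r[k - 1] * math.cos(2.0 * math.pi * f * k)
        density.append(total)
    return density


# Illustrative data only: a pure 7-day cycle observed for 70 days.
series = [math.sin(2.0 * math.pi * day / 7.0) for day in range(70)]
r = autocorrelation(series, 16)
freqs = [j / 56.0 for j in range(1, 29)]
density = spectral_density(r, freqs)
peak_freq = freqs[density.index(max(density))]
peak_period = 1.0 / peak_freq  # period, in days, at which the curve peaks
```

With this artificial series the curve peaks close to one cycle per seven days. Real residuals give a much noisier curve, which is why the text asks for agreement with the autocorrelation function before periodicity is accepted.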

When seasonality is present in the data, time series analysis provides
three options. The first option is to perform a sinusoidal regression on
the residuals in order to remove (and therefore describe) the periodic
component. A second possibility is to form a new series through seasonal
backspacing. This involves subtracting each ith measurement from the
(i + k)th measurement (see O. D. Anderson, 1976, chapter 12 for a more
detailed discussion), where k is the period of the oscillatory component. The
new series, which has length N-k, is used for further analysis. If
neither of these techniques is successful, then the periodicity can be built
into the noise model discussed below.
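
Seasonal backspacing as described here amounts to a single lagged subtraction. A minimal sketch, with a hypothetical function name:

```python
def seasonal_backspace(x, k):
    """Subtract each ith measurement from the (i + k)th measurement.

    Returns a new series of length len(x) - k.
    """
    return [x[i + k] - x[i] for i in range(len(x) - k)]
```

For example, a series made of a repeating 7-day pattern plus a linear trend of 0.5 per day reduces to the constant 3.5 under `seasonal_backspace(x, 7)`, since the weekly pattern cancels and only seven days of trend remain.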
Building a Model for the Residuals

Even in the situation where one of these techniques is successful,
there will always remain a series of residuals whose characteristics have
not yet been described. These residuals cannot be assumed to represent
an independent random sample even where homogeneity of variance is a
viable assumption. Time series analysis makes the far less restrictive
assumption that these residuals represent a stationary time series.
Statisticians define stationary as meaning that the estimated residuals can be
conceptualized as a single realization of an underlying stochastic process.
Furthermore, this stochastic process is such that the joint distribution
function of measurements $z_i, z_{i+1}, \ldots, z_{i+m}$ is the same for each
starting point i given a value of m (see Anderson, 1976, chapter 1).

Recent work in the modeling of dependence in time series has assumed
that those dependencies can be described by the following linear model:

$$z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}$$

The $a_t$'s are assumed to be independent identically distributed $(0, \sigma_a^2)$
and the $\phi_k$'s and the $\theta_k$'s are constants. Often it is also assumed that
the $a_t$'s are normally distributed.
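
To make the linear model concrete, the sketch below generates an artificial series from chosen constants. It assumes normally distributed shocks and the Box and Jenkins (1970) sign convention, in which the moving average terms are subtracted; the function name, the seed, and the burn-in length are hypothetical choices for the example.

```python
import random


def simulate_arma(phi, theta, n, sigma=1.0, seed=0, burn_in=200):
    """Generate n values of z_t = sum(phi_j * z_{t-j}) + a_t - sum(theta_j * a_{t-j}).

    phi and theta are lists of constants; the a_t are independent normal
    shocks with mean 0 and standard deviation sigma. The first burn_in
    values are discarded so the start-up from an empty history has faded.
    """
    rng = random.Random(seed)
    z, a = [], []
    for t in range(n + burn_in):
        a_t = rng.gauss(0.0, sigma)
        ar_part = sum(
            phi[j] * z[t - 1 - j] for j in range(len(phi)) if t - 1 - j >= 0
        )
        ma_part = sum(
            theta[j] * a[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0
        )
        z.append(ar_part + a_t - ma_part)
        a.append(a_t)
    return z[burn_in:]
```

Calling `simulate_arma([0.8], [], 2000)` gives a first-order autoregressive series whose neighboring values are strongly positively correlated, while empty `phi` and `theta` lists give white noise.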

Box and Jenkins (1970) have developed an iterative cycle of
identification, estimation, and verification using this model. The first step
is identification of the terms needed for an adequate model using two
series of statistics computed from the residuals. The first is the sequence
of $\hat{\rho}_k$'s for the various lags as defined above. These are consistent
estimates of the corresponding stochastic parameters defined as:

$$\rho_k = \frac{E[(z_t - \mu)(z_{t+k} - \mu)]}{E[(z_t - \mu)^2]}$$

where $E(z_t) = \mu$ for all t.

From the $\hat{\rho}_k$'s are then computed estimates of the partial autocorrelation
function (see Anderson, 1976, chapter 2 for a definition). The two series
take on characteristic patterns for the various values of p and q which
indicate the number of parameters in the noise model. These two series of
statistics allow tentative identification of the noise model.

The second step in the Box-Jenkins iterative cycle involves
estimation of the parameters for the model identified. This procedure involves
fitting by the use of the least squares criterion, but is accomplished
through non-linear techniques. It can involve building seasonality as
discussed above into the noise model where seasonal backspacing and
sinusoidal regression have failed.

If the resulting description of serial dependencies in the residuals
is appropriate, then the "second-order residuals" $a_t$ should estimate an
independent random sample. When this is true, then a new series of
autocorrelation coefficients (computed now on the $a_t$'s) should all estimate
zero values for the underlying stochastic process. Under this null
hypothesis,

$$n \sum_{i=1}^{s} \hat{\rho}_i^2 \sim \chi^2_{s-p-q}$$

where s is the number of autocorrelations computed
and p and q identify the number of terms in the noise model. If the
computed value is non-significant at the specified $\alpha$ level, then the model
has been verified as adequate. Also, under the same assumption, the
standard deviation of $\hat{\rho}_k$ is $1/\sqrt{n}$ for all k. This fact can be used as an additional
check for the adequacy of the model. If the model is rejected, then the
iterative cycle of identification, estimation, and verification continues.
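
The verification statistic can be sketched as follows. The function name and the returned tuple are assumptions for the example; the comparison with a chi-square critical value is left to a published table so that no outside library is required.

```python
import math


def verification_statistic(second_order_residuals, s, p, q):
    """Return (statistic, degrees_of_freedom, standard_error).

    statistic          -- n times the sum of the first s squared
                          autocorrelations of the second-order residuals
    degrees_of_freedom -- s - p - q, for comparison with a chi-square table
    standard_error     -- 1 / sqrt(n), the approximate standard deviation
                          of each autocorrelation under the null hypothesis
    """
    n = len(second_order_residuals)
    mean = sum(second_order_residuals) / n
    dev = [value - mean for value in second_order_residuals]
    denom = sum(d * d for d in dev)
    r = [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
        for k in range(1, s + 1)
    ]
    statistic = n * sum(r_k * r_k for r_k in r)
    return statistic, s - p - q, 1.0 / math.sqrt(n)
```

A statistic larger than the tabled chi-square value for `s - p - q` degrees of freedom at the chosen level indicates that the second-order residuals still contain dependence, so the identification step would be repeated.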

The final step in this particular application of time series analysis
techniques is to treat the regression coefficient(s), the estimate of the
variance of the $a_t$'s (which is smaller than the variance of the regression
residuals since we have "explained" part of the variance with our noise
model), and the estimates of the parameters of the noise model as initial
estimates. Non-linear fitting by least squares is then used to minimize
the value of $\sigma_a^2$. Pierce (1971b) has shown that the results of this
estimation procedure are asymptotically multivariate normal, consistent, and
unbiased; furthermore, the estimates of the regression parameters are
independent of the parameter estimates for the noise model. Also, for several
simple models, it was shown that the asymptotic results are very well
approximated in Monte Carlo work (where the model is known) with samples
of size 50.

Prediction 

Just as visual analysts (using the Standard Behavior Chart) use their
descriptive statistics as an aid in the prediction of future data, so time
series analysts use the parameter estimates described above to draw the
maximum information for prediction. The prediction used is the expected
value of the series at the predicted time conditioned on the data available,
since this minimizes $E(Y-\hat{Y})^2$. (For a series where the model is known rather
than estimated and does not involve seasonal backspacing, the resulting
value of $E(Y-\hat{Y})^2$ is always between $\sigma_a^2$ and the variance of the residuals
from which the model is built.) Where the Box-Jenkins model is built on a
seasonally backspaced series, predictions of future values of the
backspaced series are made. These predictions are then used to obtain
predictions for future data in the original series.
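
The two stages of prediction for a backspaced series can be sketched with the simplest possible noise model. The example assumes a first-order autoregressive model with known constants `phi` and `mu`, which is an illustrative choice and not one of the models fitted in this study; both function names are hypothetical.

```python
def forecast_ar1(w, phi, mu, horizon):
    """Conditional expectations of an AR(1) series 1..horizon steps ahead.

    For z_t - mu = phi * (z_{t-1} - mu) + a_t, the expected value h steps
    past the last observation w[-1] is mu + phi**h * (w[-1] - mu).
    """
    return [mu + phi ** h * (w[-1] - mu) for h in range(1, horizon + 1)]


def undo_backspacing(x, w_forecasts, k):
    """Convert forecasts of the backspaced series into forecasts of x.

    Each forecast of x is the value k steps earlier (observed, or already
    forecast) plus the corresponding forecast of the backspaced series.
    """
    extended = list(x)
    for w_hat in w_forecasts:
        extended.append(extended[len(extended) - k] + w_hat)
    return extended[len(x):]
```

Given an original series `x` and its backspaced series `w` with period 7, `undo_backspacing(x, forecast_ar1(w, phi, mu, 3), 7)` returns predictions for the next three days of the original series.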

Literature  on  Prediction 
Visual  Prediction  of  Behavioral  Data 

Koenig's (1972) study involved a comparison of least squares and
quarter-intersect techniques for providing linear fits to log transformed
frequencies (see Koenig, 1972, p. 13 for an illustration of the latter
technique). In addition to two goodness of fit criteria, prediction of
future behavior was used as a major criterion of comparison. However,
since at that time it was not deemed feasible to predict exact human
frequencies, it was decided to compare the accuracy of envelopes drawn around
least squares lines to the accuracy of envelopes drawn around
quarter-intersect lines in terms of the ability of the envelopes to encompass
future frequencies (see Koenig, 1972, p. 17 for example).

The conclusion was that visual analysts using the quarter-intersect
method showed very similar predictive accuracy to that of computerized
least squares linear regression. Also, preliminary indications were cited
that, "although the Behavior Bank does not have enough cases to
computer-summarize yet, freehand lines of best fit drawn by most children and
teachers appear to be almost as accurate as the quarter-intersect line"
(Koenig, 1972, p. 32).

As a refinement of Koenig's comparison of statistical versus visual
techniques, this study attempted to compare time series analysis as
described above with unrestricted visual analysis. (Subjects were encouraged
to use any technique normally used for visual analysis, including the
quarter-intersect method.) Refinements made on Koenig's (1972) study
include what is considered to be a more sensitive measure of accuracy of
prediction.

Clinical  Versus  Statistical  Prediction 

Although  the  rich  history  of  clinical  versus  statistical  prediction 
is  only  tangentially  related  to  this  research,  a  review  of  the  topic  by 
Holt  (1970)  raised  two  objections  to  previous  publications  which  do  have 
some  relevance.     First,  many  of  the  studies  had  used  discriminant  function 
techniques to maximize discrimination regarding group membership on the
basis of prior data. However, rather than comparing the success of using
the resulting equations to predict group membership for a new sample of
subjects for comparison with success in clinical prediction, these
previous studies had often used the same group of subjects from which the
regression weights were derived (i.e., no cross-validation had been used).
Second,  the  "clinicians"  used  for  these  studies  often  possessed  little  or 
no  clinical  training  and/or  were  too  few  in  number  to  obtain  generalizable 
results.     Blumetti  (1972)  met  these  objections  for  a  typical  study  of  this 
type  while  being  careful  to  present  information  on  the  sample  from  which 
the  equations  were  computed  to  clinicians  prior  to  asking  them  to  predict 
for  the  cross-validation  group  of  subjects. 

An  analogy  might  be  drawn  to  the  present  study,  since  subjects  viewed 
the  complete  data  set  from  which  time  series  models  were  derived  before 
predicting  future  data.    Also,  time  series  criterion  predictions  were  made 
on  data  which  was  not  used  to  derive  the  model,  and  a  sufficient  number  of 
subjects  were  used  (many  of  them  recognized  leaders  in  visual  analysis 
using  the  Standard  Behavior  Chart)  to  allow  adequate  conclusions  to  be 
drawn.


CHAPTER  III 
METHODOLOGY 


Several successive sets of procedures were used to both gather and
analyze the data for this study. The first section of this chapter
outlines in detail the time series analyses of the five data sets used for
prediction by both time series modeling and by two groups of subjects. The
first set was designated data set 1 as opposed to a letter designation
because its presentation to subjects always came first. The remaining data
sets were presented in varying sequence.

The second section presents details of the logic involved in
determining both the order of presentation of the data sets and other
procedures for gathering subjects' predictions for comparison with time series
analysis predictions. Subsections outlining the sequence of procedures used
to analyze the resulting predictions are contained in the last section of
the chapter.

Time  Series  Analysis 

Data  Set  1 

Time  series  analyses  as  described  in  Chapter  II  were  performed  on  five 
data  sets.     Data  set  1  (Appendix  A,  figure  1)  was  self-report  data  from  a 
graduate  student  in  applied  behavior  analysis  regarding  his  frequency  of 
yawning.    Analysis  of  data  set  1  began  with  fitting  a  straight  line  by 
least  squares.     Because  of  the  logarithmic  vertical  scale  of  the  Standard 
Behavior  Chart,  the  regression  was  performed  on  the  log  transform  of  the 
observed  frequencies,  with  the  resulting  residuals  plotted  and  punched  as 
well  as  listed. 
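
The first step for data set 1 can be sketched as an ordinary least squares line fitted to the base-10 logarithms of the daily frequencies. The function name and the numbering of days from 1 are assumptions; the sketch also assumes every frequency is greater than zero, since the logarithm of zero is undefined.

```python
import math


def fit_log_linear(frequencies):
    """Least squares line through log10(frequency) against day number.

    Returns (slope, intercept, residuals), all on the log10 scale, with
    days numbered 1, 2, ..., len(frequencies).
    """
    days = list(range(1, len(frequencies) + 1))
    logs = [math.log10(f) for f in frequencies]
    n = len(days)
    day_mean = sum(days) / n
    log_mean = sum(logs) / n
    sxx = sum((d - day_mean) ** 2 for d in days)
    sxy = sum((d - day_mean) * (v - log_mean) for d, v in zip(days, logs))
    slope = sxy / sxx
    intercept = log_mean - slope * day_mean
    residuals = [v - (intercept + slope * d) for d, v in zip(days, logs)]
    return slope, intercept, residuals
```

The residuals returned here are the starting point for the autocorrelation and spectral density computations. Because the fit is on the log scale, `10 ** (7 * slope)` gives the multiplicative change in frequency per week implied by the line.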

Detection  of  periodicity.    Since  the  regression  residuals  are  only 
the  starting  point  for  time  series  analysis,  a  set  of  Fortran  subroutines 
written  by  Mark  C,  K.  Yang  (personal  communication)  was  used  to  continue 
analysis.    First,  the  autocorrelation  function  was  computed  to  assist  in 
detecting  periodic  components.    The  first  16  autocorrelations  for  data  set 
1  are  listed  in  Appendix  B,  table  11.    Although  autocorrelations  at  some 
lags were larger than would be expected for a white noise process (i.e.,
$\rho_k = 0$ for all $k$), there was no evidence of a series of large
autocorrelations $\rho_{i \cdot k}$ to indicate the presence of a cyclical
component with period $k$.
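
As an illustration of what the autocorrelation function reveals, the following sketch computes sample autocorrelations with the usual estimator (lagged cross products divided by the sum of squared deviations). It is not the Fortran routine used in the study; the function name and the test series are hypothetical.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]

# Hypothetical series with an exact period of 4 days.
cyclic = [1.0, 0.0, -1.0, 0.0] * 8
r = autocorrelations(cyclic, 8)
```

For this artificial series the large positive values fall at lags 4 and 8, which is the pattern of large autocorrelations at multiples of the period that the text describes looking for.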

As  a  second  method  for  detecting  cyclical  components,  the  spectral 
density  function  was  computed.    The  results  of  spectral  density  analysis 
for  data  set  1  are  displayed  in  Appendix  B,  figure  6.     (The  bandwidth  of 
16  refers  to  the  number  of  autocorrelation  coefficients  used  to  estimate 
spectral  density.    Density  functions  computed  at  two  other  bandwidths  were 
similar  in  shape.)     In  similar  manner  to  the  results  for  the  autocorrelation 
function,  there  were  some  peaks  in  the  spectral  density  curve,  but  none  was 
pronounced  enough  to  indicate  periodicity  in  data  set  1.    Thus,  the  combined 
information  from  inspecting  both  the  autocorrelation  function  and  the 
spectral  density  function  supported  a  decision  to  build  a  noise  model 
directly  on  the  regression  residuals,  since  the  residuals  appeared  to 
estimate  a  stationary  time  series. 

Model  building.    At  this  point  the  Box-Jenkins  iterative  cycle  of 
identification, estimation, and verification (Box and Jenkins, 1970) provided
a powerful analytical process for the description of the dependence
structure of the estimated residuals. The first step was to identify the
number of terms needed for an adequate model using the autocorrelation
function given in table 11, Appendix B. From that table it can be seen
that the larger autocorrelations are those at lags 1, 3, 10, 11, 13, and 15.
Other autocorrelations, while non-zero, are not sufficiently large to
indicate non-zero values for the $\rho_k$'s of the presumed underlying
stochastic process.

For each potential model it is possible to compute the theoretical
values of the $\rho_k$'s. The following model was chosen because the non-zero
theoretical values of $\rho_k$ for $k$ = 1, 3, 10, 11, and 13 for this model
provided a parsimonious explanation of the observed autocorrelation function:

$$N_t - \phi_{10} N_{t-10} = a_t + \theta_1 a_{t-1} + \theta_{13} a_{t-13}, \qquad a_t \sim \mathrm{IID}(0, \sigma_a^2)$$

(In this equation and those that follow the t subscript numbers the days
consecutively from day 1 through day n.)

Estimation of the model parameters then proceeded with non-linear
least squares estimates of $\hat{\phi}_{10} = .08952$, $\hat{\theta}_1 = .32014$,
and $\hat{\theta}_{13} = -.58211$.
Thus, the variation from the trend line on a given day was not independent
of the residuals which precede it. Specifically, the variation from the
trend line can in part be predicted from residuals occurring 1, 10, and 13
days earlier. (Having detected and described these dependencies, the
researcher may then look for possible explanations.)

The final step in building a model for the residuals was verification
of the adequacy of the model. For this purpose the sequence of $a_t$'s was
estimated by computing:

$$\hat{a}_t = N_t - .08952\, N_{t-10} - .32014\, \hat{a}_{t-1} + .58211\, \hat{a}_{t-13}$$
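
The recursion for the second order residuals can be sketched as follows. The coefficients are the estimates reported in the text; the function name is hypothetical, and treating terms that reach before day 1 as zero is an assumption, since the source does not state its start-up convention.

```python
def second_order_residuals(noise, phi10=0.08952, theta1=0.32014, theta13=-0.58211):
    """a_t = N_t - phi10*N_{t-10} - theta1*a_{t-1} - theta13*a_{t-13}.

    `noise` holds the regression residuals N_t, with noise[0] being day 1.
    Terms before the first observation are set to zero (an assumption).
    """
    a = []
    for t, n_t in enumerate(noise):
        n_lag10 = noise[t - 10] if t >= 10 else 0.0
        a_lag1 = a[t - 1] if t >= 1 else 0.0
        a_lag13 = a[t - 13] if t >= 13 else 0.0
        a.append(n_t - phi10 * n_lag10 - theta1 * a_lag1 - theta13 * a_lag13)
    return a

# Hypothetical input: a single unit residual on day 1 followed by zeros.
a = second_order_residuals([1.0] + [0.0] * 13)
```

Feeding in a single unit residual shows how one day's deviation propagates through the lag 1, lag 10, and lag 13 terms.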

It should be remembered that if the model is appropriate, then these "second
order residuals" should estimate an independent random sample. Under this
null hypothesis $64 \sum_{k=1}^{16} \hat{\rho}_k^2 \sim \chi^2(13)$, where the
$\hat{\rho}_k$'s are now computed on the
$a_t$'s rather than the regression residuals (64 equals the number of data
points used, 16 is the number of autocorrelations computed, and the 13
degrees of freedom = 16 - 3, since $\phi_{10}$, $\theta_1$, and $\theta_{13}$ were the three parameters
estimated). The observed value was $\chi^2(13) = 5.566$, $p > .95$, which gave
support for the adequacy of the model. (The asymptotic derivation of the
$\chi^2$ distribution for this statistic makes specification of power for the
test difficult, but in general a larger $\alpha$ implies a smaller type II error
rate. Since the decision was to accept the null hypothesis, the largest
possible $\alpha$ was chosen for each of the models discussed.) Additional
support for the adequacy of the model resulted from computing the standard
deviation of $\hat{\rho}_k$ under the hypothesis of correct model identification. All
autocorrelations computed on the $a_t$'s were < .14 compared to a standard
deviation of .125.
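
The verification statistic above (n times the sum of squared residual autocorrelations, commonly called the Box-Pierce statistic) and the standard deviation of .125 can be reproduced with a short sketch. The function name and the sixteen autocorrelation values are hypothetical; only n = 64, 16 lags, and 3 parameters come from the text.

```python
import math

def portmanteau(residual_autocorrelations, n_points, n_params):
    """Return (Q, degrees of freedom) for Q = n * sum of squared autocorrelations."""
    q = n_points * sum(r * r for r in residual_autocorrelations)
    df = len(residual_autocorrelations) - n_params
    return q, df

n_points = 64
# Approximate standard deviation of an autocorrelation under independence.
std_dev = 1 / math.sqrt(n_points)

# Hypothetical residual autocorrelations, all equal to .05, at 16 lags.
q, df = portmanteau([0.05] * 16, n_points, n_params=3)
```

With 64 data points, 1/sqrt(64) gives the .125 quoted in the text; the value of Q is then referred to a chi-square table with the returned degrees of freedom.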

At this point, separately obtained least squares estimates of regression
trends, noise model parameters, and variance of the $a_t$'s were available.
The final step in model building was joint non-linear least squares
estimation of variance, regression, and noise model parameters, which
resulted in the following description of the structure of the data set:

$$\log Y_t = .076 \log Y_{t-10} - 4.8(1 - .076) + .01\,(t) - .0076\,(t - 10) + a_t + .275\, a_{t-1} - .62\, a_{t-13}, \qquad a_t \sim \mathrm{IID}(0, .64)$$

Note  that  the  resulting  description,  by  taking  into  account  the  structure 
in  the  data  set,  has  reduced  the  "unexplained"  variance  from  .924  for  the 
residuals from simple linear regression to .64 for residuals in the full
model.

Prediction.  At  this  point,  the  model  that  had  been  developed  was  used 
to  predict  the  level  of  yawning  for  the  next  ten  days.     (These  additional 
data  were  recorded  prior  to  time  series  analysis,  but  were  not  used  to 
develop  the  model.)    The  results  of  this  prediction  are  compared  in 
Chapter IV with the level of behavior that was actually recorded for
the next ten days.

Data  Set  A 

Data  set  A  (see  Appendix  A,  figure  2)  is  actually  another  phase  of 
the  data  from  which  data  set  1  was  drawn  (i.e.,  the  same  behavior  of  the 
same  subject  was  measured,  but  for  a  different  period  of  time  and  under 
differing  environmental  conditions).     In  this  case  66  days  of  recorded 
data  were  available  and  60  days  were  used  to  build  the  model,  leaving  six 
days  to  be  predicted.    Analysis  began  with  fitting  a  least  squares  straight 
line  through  the  logarithms  of  the  frequencies. 

Detection  of  periodicity.     The  autocorrelation  function  for  the 
regression  residuals  (Appendix  B,  table  12)  indicated  a  possible  seven 
day  cyclical  component  due  to  the  relatively  greater  size  of  the  lag  7 
and  lag  14  autocorrelations.    This  result  of  inspecting  the  autocorrelation 
function confirmed the visual impression that there might be a weekly pattern
in the charted behavior. Additional supporting evidence came from
the spectral density function (Appendix B, figure 7), since the highest
peak occurred at a period of seven days (lower peaks at other periods are
not supported by the autocorrelation function and hence are probably due
to sampling error).

Model  building  and  prediction.     Given  that  a  seven  day  cycle  had  been 
detected,  the  next  step  was  to  explore  alternatives  for  accounting  for  it 
in  the  analysis.     In  this  particular  instance  the  technique  of  seasonal 
backspacing with a lag of seven was not successful, since the autocorrelation
function computed on the new series (of 59 points with the first =
day 8 - day 1, etc.) still contained a large correlation coefficient at
lag seven.
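
Seasonal backspacing (differencing at the seasonal lag) is simple to state in code. The sketch below is illustrative only; the function name and the example series are hypothetical.

```python
def seasonal_difference(series, lag):
    """Return the series of differences x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical series that repeats exactly every 7 days.
weekly = [float(t % 7) for t in range(21)]
diffed = seasonal_difference(weekly, 7)
```

A series of n points yields n - 7 differences, and a perfectly repeating weekly pattern is removed entirely. When large autocorrelations remain at lag seven after differencing, as they did here, the cycle is not a fixed repeating pattern of that kind.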

Sinusoidal regression likewise failed to remove the presence of
large  autocorrelation  coefficients  at  high  lags.     (Least  squares  regression 
was  used  to  estimate  both  the  amplitude  and  the  phase  which  minimized  the 
error  sum  of  squares  for  a  period  of  seven  days.) 

The  third  option  of  building  the  large  coefficients  at  high  lags  into 
a  Box-Jenkins  model  for  the  regression  residuals  was  successful.  This 
process  began  with  the  identification  of  the  following  model: 

$$N_t = \phi_7 N_{t-7} + a_t, \qquad a_t \sim \mathrm{IID}(0, \sigma_a^2)$$

This model assumes that autocorrelations at all lags other than seven and
multiples of seven differ from zero by sampling error, and assumes non-zero
values for $\rho_7$, $\rho_{14}$, etc.

Estimation of the value of $\phi_7$ resulted in a value of $\hat{\phi}_7 = .53$. Thus,
the variation from the regression line for a given day was not independent
of the residuals that preceded it, but rather this variation depended on the
residual that occurred one week earlier.

Verification of the adequacy of the model came from acceptance that
the $a_t$'s are independent through a value of $\chi^2(13) = 11.62$, $p > .50$. Final
non-linear least squares estimation of the variance of the $a_t$'s, the linear
regression trend, and the noise model parameter resulted in the following
model:

$$\log Y_t = .4774 \log Y_{t-7} - 1.67 - .01954\,(t) + .00933\,(t - 7) + a_t, \qquad a_t \sim \mathrm{IID}(0, .45)$$

This model was then used to predict the level of yawning for the next six
days.
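
A prediction from the final model for data set A might be computed along the following lines. The coefficients are those reported above; the function name and the input series are hypothetical, and replacing each future $a_t$ by its expected value of zero is an assumption about how the forecasts were formed.

```python
def forecast_data_set_a(log_y, horizon=6):
    """Forecast log Y for the days after the observed series.

    log_y[0] is day 1. Future shocks a_t are replaced by their expected
    value of zero (an assumption).
    """
    history = list(log_y)
    for _ in range(horizon):
        t = len(history) + 1          # day being predicted
        lagged = history[t - 8]       # log Y_{t-7}, stored at index t - 8
        history.append(0.4774 * lagged - 1.67 - 0.01954 * t + 0.00933 * (t - 7))
    return history[len(log_y):]

# Hypothetical input: 60 days on which log Y was zero.
predictions = forecast_data_set_a([0.0] * 60)
```

Because only six days are predicted and the model reaches back seven days, every forecast here depends on observed values rather than on earlier forecasts.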

Data  Set  B 

For  data  set  B  (see  Appendix  A,  figure  3)  65  of  70  days  of  self-report 
data on cigarette smoking were used to build a model. The first step was
again the least squares regression of a straight line through the data.
In this case, however, visual inspection indicated that linear regression
through the original (rather than the log transformed) data was more
appropriate because the line more closely fitted the trend in the data. Also,
a  plot  of  the  original  data  showed  greater  homogeneity  of  variance  than 
a  similar  plot  of  the  log  transformed  data. 

Detection  and  description  of  periodicity.    The  autocorrelation  function 
for the regression residuals (Appendix B, table 13) contained several
autocorrelations,  with  those  at  lags  5,  10,  and  15  indicating  the  presence 
of  a  five  day  periodic  component.     (Larger  autocorrelations  at  lags  2  and 
3  did  not  continue  with  the  expected  pattern  of  4,  6,  8  ...  and  6,  9,  12  ... 
respectively.)    The  spectral  density  function  (Appendix  B,  figure  8)  also 
had  a  broad  peak  in  the  area  of  a  five  day  period,  confirming  the  results 
of  inspecting  the  autocorrelation  function. 

Seasonal  backspacing  at  lag  5  failed  to  remove  the  presence  of  large 
autocorrelations  at  lags  5  and  10.    However,  all  other  autocorrelations 
computed on the differenced series of regression residuals were of magnitude
< .15, justifying the assumption that they estimated values of $\rho_k = 0$ for
the underlying stochastic process.

Model building and prediction. These results made possible the
identification of a model for the differenced series, with the final estimation
resulting in the following model:

$$Y_t = Y_{t-5} + .00074 - .00002\,(t) + a_t - .55663\, a_{t-5} - .27427\, a_{t-10}, \qquad a_t \sim \mathrm{IID}(0, .00003)$$

To verify the accuracy of this model, the autocorrelation function for the
$a_t$'s was computed with the result that all $\hat{\rho}_k$ were of magnitude < .13
compared to a standard deviation of .129. The $\chi^2$ test also confirmed the
adequacy of the model, $\chi^2(12) = 5.595$, $p > .90$. Note also that the
"unexplained" variance of .0054 for the regression residuals has been reduced
to .00003. The results of using this model to predict days 65 to 70 are
discussed in Chapter IV.

Data  Set  C 

Data  set  C  (see  Appendix  A,  figure  4)  is  a  duration  measure  for  self- 
report  data  of  time  spent  in  recreational  activities.    Analysis  began  with 
a  linear  regression  through  the  log  transform  of  each  day's  datum,  which 
was  computed  as  the  inverse  of  the  number  of  minutes  spent  in  recreation 
(for  charting  conventions  with  duration  measures,  see  Pennypacker  et  al., 
1972). 

Detection  of  periodicity.     Inspection  of  the  autocorrelation  function 
(Appendix  B,  table  14)  indicated  extensive  dependencies  in  the  data  set, 
with $\hat{\rho}_k$ largest for $k$ = 7 and 14. Presence of a weekly cycle was confirmed
by the spectral density function (Appendix B, figure 9), which exhibited
a strong peak at a period of seven days. Also, the low level of the spectral
density function at a period of five days discouraged the interpretation
of sizeable autocorrelations at lags 5 and 10 as indicating periodicity,
as did the small size of $\hat{\rho}_{15}$.

Model  building  and  prediction.     Seasonal  backspacing  at  lag  7  failed 
to remove the presence of all large autocorrelations. It did, however,
enable  the  identification  of  a  parsimonious  model  for  the  backspaced 
series  of  residuals. 


Estimation of the model parameters was followed by computation of
the autocorrelation function for the $a_t$'s. All autocorrelations were of
magnitude < .17 compared to a standard deviation of .126, and the $\chi^2$
test resulted in $\chi^2 = 8.82$, $p > .80$. As a result, the model was accepted
and the final joint estimates of variance, regression, and Box-Jenkins
parameters were used to predict the next week's data from the
following complete model:

log        =  .162  log  Y^_^  +  log  Y^_7  -  .162  log  Y^_g  +   (1-.162)   (.017)  - 

.0006  (t)  +  .0006  (.162)   (t- 1)  +        -  .38  a^_^  -  .47  a^_^ 
a^  ~  IID  (0,  .04426) 

Again,  the  "unexplained"  variance  has  been  reduced,  in  this  case  from 
.271  to  .04426. 


Data  Set  D 

The final data set used in this study (see Appendix A, figure 5) consisted
of self-report data on a general measure of gross motor activity,
recorded  using  digital  pedometers  worn  by  VA  patients  throughout  each  day, 
and  known  as  the  movement  index.     (See  Goldstein,  Stein,  Sroolen,  and  Perlini, 
1976,  for  a  detailed  description  of  the  source  and  reliability  of  data  sets 
B,  C,  and  D.)    Residuals  from  a  linear  regression  on  the  original  data  were 
chosen  as  more  appropriate  for  modeling  than  those  from  regression  with 
the  log  transformed  data. 

Detection and description of periodicity. As for chart C, the
autocorrelation function (Appendix B, table 15) indicated a weekly period in
the data with $\hat{\rho}_k$ large for $k$ = 7 and 14. A second pattern indicated by
autocorrelations at lags 2, 4, and 6 did not maintain the expected pattern
at  additional  even  numbered  lags,  and  was  not  supported  by  evidence  from 
the  spectral  density  function  (Appendix  B,  figure  10) .    The  major  peak  in 
the  spectral  density  function  occurred  at  a  period  of  3.5  days,  which 
would  coincide  with  the  large  lag  7  and  14  autocorrelations. 

Seasonal  backspacing  at  lag  7  suggested  a  noise  model,  but  an  attempt 
to  estimate  the  resulting  parameters  failed  to  indicate  adequacy  of  the 
model.    However,  sinusoidal  regression  with  a  period  of  3.5  days  was 
successful, since the magnitude of $\hat{\rho}_k$ was < .16 for all $k$ (compared to a
standard  deviation  of  .12).     Because  of  this,  it  was  unnecessary  to  fit  a 
noise  model  to  the  "second  order"  residuals  from  sinusoidal  regression. 

Final estimation and prediction. The final joint estimation of
linear regression and sinusoidal trends made use of the trigonometric
identity sin (a + b) = sin a cos b + cos a sin b to allow indirect least
squares estimation of the phase of the sine wave as well as its amplitude
and period. The resulting model reduced "unexplained" variance from .004
to .0001, and was used to predict a week of additional data:

$$Y_t = .01379 - .00003\,(t) + .00055 \sin(1.79417\, t) - .00258 \cos(1.79417\, t) + a_t, \qquad a_t \sim (0, .00001)$$
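
The angle-sum identity sin (a + b) = sin a cos b + cos a sin b is what allows a sine wave with unknown phase to be fitted as a linear combination of a sine and a cosine term. The sketch below recovers the implied amplitude, phase, and period from the coefficients reported for data set D; the variable names are hypothetical.

```python
import math

# Coefficients of the sine and cosine terms and the angular frequency,
# taken from the fitted model for data set D.
b_sin, b_cos, omega = 0.00055, -0.00258, 1.79417

# b_sin*sin(wt) + b_cos*cos(wt) = A*sin(wt + phase), where
# b_sin = A*cos(phase) and b_cos = A*sin(phase).
amplitude = math.hypot(b_sin, b_cos)
phase = math.atan2(b_cos, b_sin)
period_days = 2 * math.pi / omega

def two_term(t):
    """Sinusoidal part of the model written with separate sine and cosine terms."""
    return b_sin * math.sin(omega * t) + b_cos * math.cos(omega * t)

def single_wave(t):
    """The same component written as one sine wave with a phase shift."""
    return amplitude * math.sin(omega * t + phase)
```

The period implied by the angular frequency is very close to the 3.5 days identified from the spectral density function.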

Gathering Comparison Data

For the five data sets described above, predictions were made by two
groups of subjects: (1) fourteen subjects were classified as experts in
the visual analysis of data using the Standard Behavior Chart and (2) eighteen
were teachers who use the Standard Behavior Chart in their precision
teaching.

The Pilot Study

Prior to the collection of these data, a pilot study was completed
using three charts. These three charts were presented in the same order
to each of six graduate students in applied behavior analysis at the
University of Florida. All were experienced in the use of the Standard
Behavior Chart. Results were not conclusive, but tended to indicate
that time series analysis predicted more accurately than did subjects
using the Standard Behavior Chart. Also, since the same order of
presentation of a smaller number of data sets was used for all subjects,
questions  about  the  influence  of  position  in  the  sequence  on  prediction 
were  unanswerable  from  the  pilot  study. 

The  Experimental  Task 

For the main study the full set of five charts was presented to subjects.
Data set 1 was always presented first, with the following deliberately
open-ended instructions designed to explore ways in which people
use data to make predictions:

Please  examine  data  set  #  1  and  place  dots  on  the  Standard 
Behavior  Chart  at  the  level  of  behavior  you  predict  for  the  next 
10  days.    Then  explain  in  as  much  detail  as  possible  in  the  space 
below  how  you  arrived  at  this  prediction.     Please  include  a  list 
of  which  previous  days  are  most  useful  in  predicting  for  any 
given day (e.g. the seventh preceding day).

Order  of  presentation  for  the  four  remaining  data  sets  varied  in  the 
pattern  described  below.    A  standard  set  of  instructions  was  used  for  each 
of  these  four  charts  except  for  variation  in  the  number  of  days  subjects 
were requested to predict. The instructions were:

(1)  Fit  a  celeration  (learning)  line  or  other  predictive  function 
through  the  data. 

(2)  State  in  the  blank  space  provided  here  the  number  of  days  in  the 
period  (i.e.,  in  a  complete  cycle)  of  any  recurring  pattern  in 
the  data. 
(3) Predict the level of behavior for the next x days by placing
dots on the chart.

Description of Subjects

Subjects  were  sought  from  two  groups  of  people.    One  group  consisted 
of  leaders  in  the  field  of  precision  teaching,  many  with  publications  and 
national reputations. The friendly cooperation received was much appreciated,
with participation obtained from a total of 14 experts in Tilton, New
Hampshire;  Kansas  City,  Missouri;  Lawrence,  Kansas;  Gainesville  and  Panama 
City,  Florida;  and  Toronto,  Canada. 

The second group of subjects consisted of 18 classroom teachers who
use precision teaching techniques, including visual analysis with the
Standard Behavior Chart. Again, cooperation was almost unanimous among
those asked to participate, with groups of teachers from Shawnee Mission,
Kansas,  and  Panama  City,  Florida,  analyzing  the  five  data  sets. 

The  Experimental  Design 

Within  each  of  the  two  groups,  sequences  of  charts  were  randomly 
assigned  from  among  12  of  the  24  possible  sequences.    Those  sequences 
chosen  are  listed  below  and  fulfill  the  requirements  for  strict  balance 
(i.e.,  each  chart  follows  each  other  chart  an  equal  number  of  times  for 
each  ordinal  position  in  the  sequence) .     Balance  in  the  sequencing  was 
desired  to  allow  estimation  of  (1)  the  effects  of  ordinal  position  and 
(2)  residual  effects  from  the  previous  chart  (see  Cochran  and  Cox,  1950, 
section 4.61a).


                                     subject #
order of presentation      1    2    3    4    5    6

first                      A    B    C    D    A    B
second                     B    A    D    C    C    D
third                      C    D    A
fourth                     D

Analyzing the Data

Before any averaging or other summarizing of data took place, the
predictions for each person for each day were plotted to produce a histogram
of predictions (Appendix C). The vertical scale for the histograms
covers the same range as the Standard Behavior Chart, allowing direct
comparison. Each log cycle, however, is broken into 10 equal intervals,
with each subject's predictions plotted to the nearest tenth for the
$\log_{10}$ (predicted value). When the results were compared with the datum
which  actually  occurred  for  each  day,  there  seemed  to  be  no  relationship 
between  the  size  of  prediction  errors  and  the  number  of  previous  days 
predicted.    It  was,  therefore,  deemed  appropriate  to  summarize  the  errors 
in  prediction  for  each  person  for  each  chart,  and  also  to  summarize  errors 
in  prediction  for  time  series  analysis  for  each  chart. 

Data  Reduction 

Data reduction for this second level of comparison was accomplished
in four ways. Primary interest was focused on computing the average magnitude
of the error. Two such summary statistics were computed: (1) the
traditional root mean square error was computed as
$\sqrt{\sum (\text{predicted} - \text{actual})^2 / N}$
and (2) the average absolute size of the error was computed for the log
transformation of the frequencies.

It  was  felt  that  this  second  measure  might  be  more  appropriate  for 
three  reasons.     First,  since  vertical  distance  on  the  Standard  Behavior 
Chart  is  logarithmic,  this  statistic  would  more  accurately  represent  the 
distances  of  predictions  from  actual  data.    A  second  related  advantage 
would  be  the  comparability  of  this  summary  statistic  for  data  sets  occurring 
in different parts of the vertical scale. Third, when exponentiated, the
resulting statistic is interpretable as representing the average percent
error (Pennypacker, personal communication). For this last reason, the
average of the logged frequencies was transformed back to the original
scale by exponentiation with the complete computing formula being:

$$\exp\left(\sum \left|\log(\text{predicted}) - \log(\text{actual})\right| / N\right).$$
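
The two measures of error magnitude can be sketched as follows. The function names and the small example are hypothetical, and natural logarithms are assumed so that exponentiation returns the statistic to a ratio scale.

```python
import math

def root_mean_square_error(predicted, actual):
    """Square root of the mean squared difference on the original scale."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def exp_mean_abs_log_error(predicted, actual):
    """exp of the mean absolute difference between logged values."""
    n = len(predicted)
    total = sum(abs(math.log(p) - math.log(a)) for p, a in zip(predicted, actual))
    return math.exp(total / n)

# Hypothetical predictions: half the actual value, exact, and double.
predicted = [2.0, 5.0, 10.0]
actual = [4.0, 5.0, 5.0]
rmse = root_mean_square_error(predicted, actual)
ratio_error = exp_mean_abs_log_error(predicted, actual)
```

In this example the first and third predictions are each off by a factor of two, and the log-based measure weights them equally, whereas the root mean square error is dominated by the third prediction because its absolute size is larger.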

There was also a secondary interest in the amount of bias in prediction
(i.e., does a person tend to place dots always above, always below,
or evenly spread around the actually occurring data?). Again, two
summary statistics were computed for each person for each chart: (1) bias
in the predictions computed as $\sum (\text{predicted} - \text{actual}) / N$ and (2) bias in
the log transformed data measured as
$\exp\left(\sum [\log(\text{predicted}) - \log(\text{actual})] / N\right)$.
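
The two bias measures differ from the error measures only in keeping the sign of each difference. A minimal sketch, with hypothetical function names and data and with natural logarithms assumed:

```python
import math

def mean_bias(predicted, actual):
    """Mean signed difference on the original scale."""
    n = len(predicted)
    return sum(p - a for p, a in zip(predicted, actual)) / n

def exp_mean_log_bias(predicted, actual):
    """exp of the mean signed difference between logged values."""
    n = len(predicted)
    total = sum(math.log(p) - math.log(a) for p, a in zip(predicted, actual))
    return math.exp(total / n)

# Hypothetical predictions: one at half the actual value, one exact, one double.
predicted = [2.0, 5.0, 10.0]
actual = [4.0, 5.0, 5.0]
raw_bias = mean_bias(predicted, actual)
ratio_bias = exp_mean_log_bias(predicted, actual)
```

A value of 1.0 for the log-based measure means that over- and under-predictions cancel as ratios, even when the raw differences do not cancel.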

A  comparison  of  the  results  for  log  transformed  data  with  the  original 
data indicated that, although the relation was obviously not linear,
essentially the same information was contained in both sets of summary
statistics. For this reason, and because of the additional advantages cited
above,  further  analysis  of  the  summary  statistics  used  the  two  statistics 
computed  for  the  log  transformed  data. 

Analysis  of  Variance 

Before  further  aggregation  of  the  data  on  subjects'  predictions  for 
the purpose of comparison with the results of time series analysis,
decisions had to be made both as to the importance of the order of
presentation of the charts and as to whether it would be appropriate to average
across  subjects  (and  groups).     Answers  to  these  questions  were  sought  using 
the  following  model  for  Analysis  of  Variance  with  data  sets  A,  B,  C,  and 
D  for  all  32  subjects: 

$$Y_{ijklm} = \mu + c_i + s_j + p_k + g_l + r_m + \varepsilon_{ijklm}, \qquad \varepsilon_{ijklm} \sim \mathrm{NID}(0, \sigma^2)$$

where $\mu$ = the grand mean
      $c_i$ = the linear effect of chart $i$, $i$ = 1,2,3,4
      $s_j$ = the linear effect of subject $j$, $j$ = 1,2,...,32
      $p_k$ = the linear effect of ordinal position $k$, $k$ = 1,2,3,4
      $g_l$ = the linear effect of group membership $l$, $l$ = 1,2
      $r_m$ = the linear effect of the chart which preceded
              the one for which error is being measured,
              $m$ = 1,2,3,4

Analysis was accomplished using effect coding for multiple regression.
Each of the effects was adjusted for all other effects because the
availability of data for 14 experts and 18 teachers resulted in partially
unbalanced data.
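
Effect coding represents a factor with g levels by g - 1 columns, with the last level coded -1 in every column so that the estimated effects sum to zero. The sketch below shows the coding for a four-level factor such as chart or ordinal position; the function name is hypothetical and choosing the last level as the reference is an assumption.

```python
def effect_code(level, n_levels):
    """Return the n_levels - 1 effect-coded columns for a level numbered 1..n_levels."""
    if level == n_levels:
        # The reference level is coded -1 on every column.
        return [-1] * (n_levels - 1)
    return [1 if j == level else 0 for j in range(1, n_levels)]

# Coding for the four charts A, B, C, D taken as levels 1 to 4.
codes = [effect_code(level, 4) for level in range(1, 5)]
```

Each coded column sums to zero across the levels, which is what makes the regression coefficients interpretable as deviations from the grand mean.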

A  decision  was  also  made  to  consider  all  factors  fixed  in  view  of  the 
procedures used to obtain both charts and subjects. While each set was,
in  some  sense,  a  sample  from  a  larger  population  of  interest,  the  usual 
assumptions  of  normality  and  random  sampling  from  an  infinite  population 
were  not  judged  appropriate  for  the  present  situation.    Also,  since  the 
analysis  was  not  designed  to  answer  the  major  experimental  question,  but 
only  to  guide  decisions  as  to  the  aggregation  of  these  particular  data,  a 
fixed  effects  model  was  the  most  appropriate.    The  only  generalization  made 
was  to  data  set  1,  and  then  only  because  gathering  of  balanced  data  for  the 
full  five  data  sets  would  have  involved  120  subjects  in  each  group. 

The  "error"  term  in  the  linear  model,  then,  is  properly  interpreted 
as representing unexplained variability. The resulting $F$ ratios are formed
under  the  assumption  that  there  is  no  interaction  among  the  main  effects, 
hence  all  interactions  are  confounded  with  main  effects. 

Results of this analysis (presented below) indicated the appropriateness
of averaging across all subjects on each data set for both measures
computed on the log transformed data (i.e., the measure of average error
and the measure of bias). This averaging was accomplished using the arithmetic
average of the unexponentiated measures, which is equivalent to the
geometric mean of the exponentiated summary statistics.
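
The equivalence between averaging the unexponentiated measures and taking the geometric mean of the exponentiated statistics can be checked numerically. The three values below are the exp (mean absolute error) entries for the first three experts in Table 1; the variable names are hypothetical.

```python
import math

# exp (mean absolute error) for experts 1 to 3 on data set 1 (Table 1).
stats = [2.146862, 2.555245, 2.935768]

# Arithmetic mean on the log scale, then transformed back.
logs = [math.log(s) for s in stats]
via_logs = math.exp(sum(logs) / len(logs))

# Geometric mean computed directly on the exponentiated values.
geometric_mean = math.prod(stats) ** (1 / len(stats))
```

The two results agree to within floating point rounding, as the identity requires.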

Hotelling's $T^2$

One immediate consequence of the decision to take an average across
subjects was to question the representativeness of the resulting average.
A reinspection of the original plotting of predictions (Appendix C) showed
a mound-shaped distribution for each day's predictions. Furthermore, since
the distributions were so similar for the two groups, and especially in
view of the results of the Analysis of Variance, the decision was made to
pool data for the two groups. Interest was then focused on comparing the
average of predictions for each day with two criteria: (1) the actual
datum and (2) the prediction made for that day using time series analysis.
(Values for these two criteria are represented in Appendix C by the tips
of the arrows to the right of each day's histogram.)

Since assumptions of normality were justifiable due to the mound-shaped
distributions of subjects' predictions, Hotelling's $T^2$ statistic was chosen
to test the mean of the errors in prediction for each day in relation to
the variability around that mean. The logic of the test was to consider
each day's predictions as a random sample from a normal population
distribution. One would not expect a sample mean to exactly equal the
population mean; however, under the null hypothesis that the population mean is
equal to some criterion, the familiar $t$ test evaluates the discrepancy
between the sample mean and the criterion in terms of the variability in the
sample. Given several days' predictions, with each day considered as a
separate dependent variable, Hotelling's $T^2$ statistic performs the
multivariate analog of the univariate $t$ test by comparing the vector of
prediction means against the vector of criterion values (Tatsuoka, 1971, pp. 76-79).

The Hotelling's $T^2$ test was deemed especially appropriate since it
utilized all of the data rather than summary statistics only, and
required no assumptions as to the pattern of correlation or homogeneity of
variance between the predictions for any one day when compared to the
predictions for any other day. Two analyses were performed for each
chart: (1) comparing the log transformed subjects' predictions for each
chart with the criterion vector of log transformed actual occurrences
and (2) comparing the log transformed subjects' predictions with the log
transformed criterion vector of predictions made using time series
analysis.
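
A one-sample Hotelling's $T^2$ of the kind described can be sketched in plain Python. This is an illustration rather than the program used in the study: the function name and the tiny data set are hypothetical, rows stand for subjects and columns for predicted days, and the conversion to an F statistic with p and n - p degrees of freedom is the standard textbook result.

```python
def hotelling_t2(samples, criterion):
    """One-sample Hotelling's T^2 of the column means against a criterion vector."""
    n, p = len(samples), len(criterion)
    means = [sum(row[j] for row in samples) / n for j in range(p)]
    # Sample covariance matrix of the columns (divisor n - 1).
    cov = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / (n - 1)
            for j in range(p)] for i in range(p)]
    diff = [means[j] - criterion[j] for j in range(p)]
    # Solve cov * x = diff by Gaussian elimination with partial pivoting.
    aug = [cov[i][:] + [diff[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * p
    for i in reversed(range(p)):
        x[i] = (aug[i][p] - sum(aug[i][j] * x[j] for j in range(i + 1, p))) / aug[i][i]
    t2 = n * sum(d * xi for d, xi in zip(diff, x))
    f_stat = t2 * (n - p) / (p * (n - 1))  # referred to F with (p, n - p) df
    return t2, f_stat

# Hypothetical log predictions from four subjects for two days.
samples = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
t2, f_stat = hotelling_t2(samples, criterion=[1.5, 1.5])
```

The covariance matrix is estimated from the data, so no pattern of correlation or equality of variances across days has to be assumed, which is the property the text emphasizes.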


CHAPTER  IV 
RESULTS 

Procedures used were, as outlined in Chapter III, the reduction of
raw data to summary measures of error and bias, Analysis of Variance for
the resulting summaries of subjects' predictions, and Hotelling's $T^2$ test
to  compare  subjects'  predictions  with  both  the  actual  data  and  with  time 
series  predictions.    Results  are  presented  below  for  these  steps  in  the 
analysis  of  predictions  made  using  time  series  techniques  and  in  the 
analysis  of  predictions  made  by  both  groups  of  visual  analysts. 

Reduction  of  Data 

The results of initial data reduction for each of the five data sets
are presented in Tables 1-5. As a reminder, the computing formula is

    exp (mean absolute error) = exp[Σ |log (predicted) - log (actual)| / N].

The result is interpretable as a measure of the average percentage error
across predictions for all days. A result of 1.0 indicates no errors in
prediction. A result of 1.12 indicates an average error of 12% of the
actual value of the datum. Interpretation is perhaps also aided by
considering the given value as the geometric mean obtained when the ratio

    max (predicted, actual) / min (predicted, actual)

is computed as a frequency multiplier for each day's prediction, and the
magnitudes of the frequency multipliers are then averaged (see
Pennypacker et al., 1972, p. 64). The asterisks in Table 1 indicate that
6 of 14 experts and 8 of 18 teachers made smaller errors in prediction on
the average than did time series analysis for data set 1.
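
The two readings of the error measure given above (exponentiated mean absolute log difference, and geometric mean of the max/min frequency multipliers) are the same quantity. The sketch below checks this on made-up frequencies; the function names and the data are hypothetical.

```python
import math


def exp_mean_abs_error(predicted, actual):
    """exp of the mean absolute difference between log predicted and log actual."""
    diffs = [abs(math.log(p) - math.log(a)) for p, a in zip(predicted, actual)]
    return math.exp(sum(diffs) / len(diffs))


def geometric_mean_multiplier(predicted, actual):
    """Geometric mean of max(p, a) / min(p, a) over all predicted days."""
    ratios = [max(p, a) / min(p, a) for p, a in zip(predicted, actual)]
    return math.prod(ratios) ** (1.0 / len(ratios))


# Hypothetical frequencies for two predicted days.
predicted = [12.0, 5.0]
actual = [10.0, 10.0]
print(exp_mean_abs_error(predicted, actual))
print(geometric_mean_multiplier(predicted, actual))
```

Here the daily multipliers are 1.2 and 2.0, and both functions return the square root of 2.4, roughly 1.55, which would be read as an average error of about 55%.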

A similar interpretation is available for exp (bias) = exp (Σ[log
(predicted) - log (actual)]/N). The presence of the × or ÷ sign in

Table 1

Prediction Summaries for Data Set 1 Using Log Transformed Data

                   exp (mean absolute error)     exp (bias)

Time Series              2.276410                × 1.120382

Experts
   subject  1            2.146862  *             × 1.345521
            2            2.555245                × 1.072638
            3            2.935768                ÷ 1.719589
            4            2.6395S9                ÷ 1.084176
            5            [illegible]  *          [illegible]
            6            2.503963                × 1.412604
            7            2.867315                × 1.164504
            8            2.011065  *             ÷ 1.235457
            9            3.124355                ÷ 1.146944
           10            2.284037                × 1.493364
           11            2.217852  *             × 1.435608
           12            3.384237                × 2.664478
           13            2.217669  *             × 1.260059
           14            1.719402  *             ÷ 1.195161

Teachers
   subject  1            2.408623                × 1.200235
            2            2.211190  *             × 1.767710
            3            2.556755                ÷ 1.107828
            4            2.175099  *             × 1.653617
            5            1.911114  *             × 1.728397
            6            2.985281                × 1.331434
            7            2.327653                × 1.522552
            8            2.403597                × 1.226340
            9            2.055245  *             × 1.519078
           10            2.380712                ÷ 1.004441
           11            2.818843                ÷ 1.501328
           12            1.841117  *             × 1.273404
           13            2.848591                × 2.571939
           14            2.233241  *             × 1.428658
           15            1.926377  *             ÷ 1.175775
           16            2.426631                × 1.600980
           17            2.130599  *             ÷ 1.158932
           18            3.233157                ÷ 2.981320

front of a tabled value indicates that the subject was either generally
above (×) or below (÷) the actual day's datum. The percentage
interpretation holds, and as expected, the signed average for bias in column 2
is smaller than the unsigned average for error in column 1. Again, the
tabled value represents the geometric mean of signed frequency multipliers
obtainable for each day. The results presented in Table 1 indicate
that 2 of 14 experts and 2 of 18 teachers were less biased than the
slightly over 12% bias of time series analysis predictions; the latter
tend to predict frequencies greater than those which actually occurred.
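
The bias measure and its × or ÷ sign can be sketched as follows. This is an illustration only: it assumes that a ÷ entry in the tables is the reciprocal of a multiplier smaller than 1, so that the tabled magnitude is never below 1.0. The function name and data are hypothetical.

```python
import math


def exp_bias(predicted, actual):
    """Return (sign, multiplier) for the signed mean log error.

    sign is "×" when predictions run above the actual data on average and
    "÷" when they run below; the multiplier is reported as a value >= 1.
    """
    diffs = [math.log(p) - math.log(a) for p, a in zip(predicted, actual)]
    mean_log = sum(diffs) / len(diffs)
    sign = "×" if mean_log >= 0 else "÷"
    return sign, math.exp(abs(mean_log))


# Hypothetical frequencies: one large overestimate, one underestimate.
print(exp_bias([40.0, 5.0], [10.0, 10.0]))
# Hypothetical frequencies: consistent underestimates by a factor of two.
print(exp_bias([5.0, 5.0], [10.0, 10.0]))
```

Because over- and underestimates cancel in the signed average, this multiplier can never exceed the unsigned error measure computed from the same predictions, which is why column 2 is smaller than column 1.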

The  results  for  data  set  A  are  presented  in  Table  2.     In  this  table, 
the  asterisks  indicate  that  4  of  14  experts  and  none  of  the  17  teachers 
who  completed  chart  A  made  smaller  errors  than  those  made  using  time 
series  analysis  (for  which  the  average  error  was  92%  of  the  actual  datum). 
An inspection of the results in Table 2 also indicates that 5 of
14 experts and 2 of 17 teachers (one teacher failed to complete this chart)
were less biased in their predictions than time series analysis. In
addition, 2 of 14 experts performed better on both criteria than did time
series analysis.

The  results  presented  in  Table  3  indicate  that  predictive  performance 
was  more  accurate  for  both  subjects  and  time  series  analysis  on  data  set 
B  than  for  previous  data  sets.    One  expert  and  no  teachers  made  smaller 
errors  on  the  average  than  the  19%  average  error  of  time  series  analysis. 
In  terms  of  bias,  4  of  14  experts  and  5  of  18  teachers  were  less  biased 
than  the  10%  average  overestimate  of  time  series  analysis  (as  indicated 
by the asterisks). One expert outperformed time series analysis in both
categories.

Using  data  set  C  (see  Table  4)  no  subject  outperformed  time  series 
analysis  in  terms  of  average  error,  with  five  experts  and  three  teachers 


Table 2

Prediction Summaries for Data Set A Using Log Transformed Frequencies

                   exp (mean absolute error)     exp (bias)

Time Series              1.918155                × 1.440351

Experts
   subject  1            2.220905                  1.399083
            2            2.947776                  1.503573
            3            1.914378  *               1.656237
            4            1.684879  *               1.187029
            5            2.296836                  1.682190
            6            2.559500                  1.287074
            7            1.901524  *               1.159363
            8            2.397485                  1.804704
            9            2.588425                  1.694615
           10            3.904874                  1.119243
           11            1.860674  *               1.525432
           12            2.128710                  1.700974
           13            2.244102                  1.657139
           14            3.063168                  2.691758

Teachers
   subject  1            2.263550                  2.208320
            2            1.974352                  1.48855S
            3            2.553388                  1.137466
            4            2.881149                  1.486582
            5            2.719420                  1.314844
            6            2.368190                  1.596056
            7            2.288722                  1.765719
            8            2.776713                  1.46013S
            9            3.522707                  1.456245
           10            2.533826                  1.289703
           11            2.162061                  1.670850
           12            2.633464                  1.968377
           13            3.049600                  2.238706
           14            3.160653                  3.002349
           15            2.841344                  1.583447
           16            2.813842                  1.026940
           17            3.118232                  1.65680S
           18            missing data

Table 3

Prediction Summaries for Data Set B Using Log Transformed Data

                   exp (mean absolute error)     exp (bias)

Time Series              1.189367                × 1.102065

Experts
   subject  1            1.447935                  1.166851
            2            1.412801                  1.328319
            3            1.342210                  1.147242
            4            1.366352                  1.201611
            5            1.274181                  1.038704
            6            1.086602                  1.026981
            7            1.360186                  1.131799
            8            1.250084                  1.184792
            9            1.319934                  1.029755
           10            1.625279                  1.080062
           11            1.465055                  1.183453
           12            1.509063                  1.273203
           13            1.392398                  1.089320
           14            1.705057                  1.705057

Teachers
   subject  1            1.331830                  1.331830
            2            1.220508                  1.182050
            3            1.568404                  1.260415
            4            1.459188                  1.031260
            5            1.357976                  1.129960
            6            1.622724                  1.286826
            7            1.250546                  1.099767
            8            1.386668                  1.153834
            9            1.350939                  1.091271
           10            1.467412                  1.001725
           11            1.379039                  1.579039
           12            1.305436                  1.052258
           13            1.422930                  1.422930
           14            1.344293                  1.301933
           15            1.380015                  1.297495
           16            1.319660                  1.319660
           17            1.225965                  1.146721
           18            1.229897                  1.045519

Table 4

Prediction Summaries for Data Set C Using Log Transformed Data

                   exp (mean absolute error)     exp (bias)

Time Series              1.073772                × 1.036881

Experts
   subject  1            1.212485                  1.000782
            2            1.257148                  1.038440
            3            1.151845                  1.124264
            4            1.123528                  1.021843
            5            1.293172                  1.055477
            6            1.353872                  1.269334
            7            1.357244                  1.100346
            8            1.127060                  1.077311
            9            1.150000                  1.136716
           10            1.503944                  1.107388
           11            1.332999                  1.028169
           12            1.261749                  1.233609
           13            1.237493                  1.029895
           14            1.236458                  1.008273

Teachers
   subject  1            1.195654                × 1.135511
            2            1.549413                × 1.549413
            3            1.203098                × 1.050881
            4            1.444535                × 1.011002  *
            5            1.180770                × 1.079344
            6            1.218213                × 1.089943
            7            1.284465                × 1.062605
            8            1.209752                × 1.075301
            9            1.210800                ÷ 1.041793
           10            1.229298                × 1.172273
           11            1.253014                × 1.130927
           12            1.377872                × 1.013607  *
           13            1.256987                × 1.048501
           14            1.262132                × 1.200791
           15            1.262292                × 1.183473
           16            1.357518                × 1.099775
           17            1.272437                ÷ 1.068324
           18            1.115241                × 1.029575

exhibiting less bias. This is also the data set for which both subjects'
and time series prediction errors and bias were smallest.

The  59%  average  error  in  time  series  predictions  for  data  set  D 
(Table  5)  was  larger  than  the  average  error  of  only  3  of  14  experts  and 
6  of  18  teachers.    However,  the  30%  bias  for  time  series  predictions 
placed  it  as  the  only  category  for  any  chart  where  a  majority  of  subjects 
were  closer  to  the  actual  data.     Predictions  for  10  of  14  experts  and 
13  of  18  teachers  were  less  biased  than  time  series  analysis  predictions, 
with  two  experts  and  five  teachers  outperforming  time  series  analysis 
in  both  categories. 

Analysis  of  Variance 

As mentioned in the methods chapter, Analyses of Variance were performed
for both of the criterion measures given above. Multiple regression
was used to adjust each effect for all others because of the partial
imbalance in the data.

Actually both adjusted and unadjusted sums of squares were computed,
and only in the case of the residual effect of a given data set on the
predictions for the next data set was there enough difference to alter the
results of the F test. In many cases there was almost no difference
between adjusted and unadjusted sums of squares, lending additional weight
to the evidence that only differences between charts were significant
when compared to the variation that could not be explained by the combined
effects of charts, subjects, positions in the sequence, group membership,
and first order residual effects. For exp (mean absolute error), these
results are presented in Table 6.
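
One common way to obtain a sum of squares for an effect adjusted for all others is to fit the regression with and without that effect and take the difference in residual sums of squares. The sketch below illustrates the idea under stated assumptions: dummy (indicator) coding, ordinary least squares via numpy, and hypothetical function names and data. It is not presented as the exact computation used for Tables 6 and 8.

```python
import numpy as np


def dummies(labels):
    """Indicator columns for every level except the first (reference) level."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels[1:]]
                     for lab in labels])


def rss(y, blocks):
    """Residual sum of squares after regressing y on an intercept plus blocks."""
    X = np.column_stack([np.ones(len(y))] + blocks)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def adjusted_ss(y, factors):
    """Sum of squares for each factor, adjusted for all other factors.

    factors maps a factor name to a list of labels, one per observation.
    """
    y = np.asarray(y, dtype=float)
    coded = {name: dummies(labels) for name, labels in factors.items()}
    full = rss(y, list(coded.values()))
    return {name: rss(y, [b for other, b in coded.items() if other != name]) - full
            for name in coded}


# Hypothetical balanced 2 x 2 layout with purely additive effects.
y = [1.0, 2.0, 3.0, 4.0]
factors = {"chart": ["a", "a", "b", "b"], "position": ["x", "y", "x", "y"]}
print(adjusted_ss(y, factors))
```

In a balanced layout such as this toy example the adjusted and unadjusted sums of squares coincide; with partially unbalanced data they can differ, which is why both were computed.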

As  a  result  of  this  analysis,  the  geometric  mean  of  exp  (mean 
absolute  error)  was  computed  for  each  of  the  five  charts  across  all  32 
Table 5

Prediction Summaries for Chart D Using Log Transformed Data

                   exp (mean absolute error)     exp (bias)

Time Series              1.591691                × 1.303664

Experts
   subject  1            1.757442                  1.256914
            2            1.639570                  1.031937
            3            1.280789                  1.022080
            4            1.913204                  1.445625
            5            1.597610                  1.017754
            6            1.846571                  1.342176
            7            1.713049                  1.102541
            8            1.555265                  1.333286
            9            1.715243                  1.326027
           10            1.694619                  1.298437
           11            1.731731                  1.151029
           12            1.329797                  1.018245
           13            1.903826                  1.222314
           14            1.620793                  1.099526

Teachers
   subject  1            1.624580                × 1.238550
            2            1.493126  *               1.047586
            3            1.399783  *             × 1.124233
            4            2.139074                × 1.393626
            5            1.784387                × 1.081995
            6            1.636663                × 1.253217
            7            1.724634                × 1.228081
            8            1.725732                  1.003623
            9            1.395914  *             × 1.024685
           10            1.526994  *             × 1.048048
           11            1.693482                × 1.127849
           12            1.800973                × 1.433590
           13            1.654450                × 1.033982
           14            2.062433                × 1.506814
           15            1.801250                × 1.173518
           16            1.895388                × 1.432164
           17            1.451138  *             × 1.394776
           18            1.532400  *             × 1.189853

subjects, with the results given in Table 7. Time series analysis
predictions were more accurate for each chart than the average across
subjects.

Table 6

Analysis of Variance for Exp (Mean Absolute Error)

Source                             SS        df       MS           F

Charts (adjusted)               28.55415      3     9.51805   123.96522 *
Subjects (adjusted)              2.85152     31      .09198     1.98027
Position (adjusted)               .16585      3      .05528      .72002
Group Membership (adjusted)       .00000      1      .00000      .00000
Residual Effects (adjusted)       .03365      3      .01122      .14609
Regressionᵃ                     35.47110     41      .86515    11.26743 *
Residual                         6.52657     85      .07678
Total                           41.99767

ᵃ The regression sum of squares was greater than the total of
the adjusted main effects (31.60517) because it represents
variation attributable to the combined set of variables.

* Indicates significance at the p < .01 level.

Table 7

Comparison of the Geometric Mean of Exp (Mean Absolute Error)
Across All Subjects with the Results of Time Series Analysis

Chart                     1           A           B           C           D

Time Series           2.276410    1.918155    1.189367    1.073772    1.591691
Average of Subjects   2.389578    2.513402    1.374759    1.261165    1.665076

As already mentioned, a similar analysis of variance was run with
exp (bias) as the dependent variable. The numbers entered were
Σ[log (predicted) - log (actual)]/N without exponentiation, since the
sign of the bias for these numbers was plus or minus and hence more
appropriate for analysis of variance than the multiply or divide sign
of the exponentiated measure. For this analysis, a check for possible
non-additivity in the data was also obtained by plotting residuals against
predicted values. The plot indicated that the assumptions for the
analysis of variance were not violated.

Results  presented  in  Table  8  for  the  analysis  of  variance  are  quite 
similar  to  those  for  exp  (mean  absolute  error). 


Table 8

Analysis of Variance for Exp (Bias)

Source                             SS        df       MS           F

Charts (adjusted)                2.4999       3      .8333     25.30519
Subjects (adjusted)               .90616     31      .02923      .88767
Position (adjusted)               .06436      3      .02145      .65148
Group Membership (adjusted)       .00000      1      .00000      .00000
Residual Effects (adjusted)       .01475      3      .00492      .14931
Regressionᵃ                      3.89691     41      .09505     2.88594
Residual                         2.79941     85      .03293
Total                            6.69632

ᵃ The regression sum of squares was greater than the total
of the adjusted main effects (3.48517) because it represents
variation attributable to the combined set of variables.

* Indicates significance at the p < .01 level.

Although a smaller proportion of the total variation was attributable
to the combined effect of all factors than for exp (mean absolute error),
the main effect of charts was again clearly the important factor. Since
other effects were nonsignificant, it was again appropriate to average
across all subjects for comparison with the results of time series
analysis. However, for this comparison of bias in subjects' predictions
with bias in time series analysis predictions, it was the magnitude of
the bias rather than its sign which was of interest; for this reason
results presented in Table 9 were computed as the geometric mean of the
magnitude of exp (bias).

Table 9

Comparison of the Average Magnitude of Exp (Bias) Across
All Subjects with the Results of Time Series Analysis

Chart                     1           A           B           C           D

Time Series           1.120382    1.440351    1.102065    1.036881    1.303664
Average of Subjects   1.420431    1.577954    1.183027    1.098021    1.190850

As with the comparisons of each person's prediction bias with time
series  prediction  bias  (Tables  1-5),  only  on  data  set  D  did  the  average 
across  subjects  indicate  more  accurate  prediction  for  subjects  than  for 
time  series  analysis. 

Hotelling's T²

The results of Hotelling's T² tests for each of the five charts using
the log transformed data are presented in Table 10 with (1) actual data
as the criterion for comparison with subjects' predictions and (2) time
series analysis predictions as the criterion. Values given are the F
equivalent to the T² statistic since it is familiar, with the conclusion
being that both the actual data and time series analysis predictions are
clearly distinct from subjects' predictions.

Table 10

F Values for the Null Hypothesis of No Difference Between
the Mean Vector of Subjects' Predictions and Two Criteria

Chart                              1         A         B         C         D

Subjects' predictions            55.86    102.21     13.14     14.28     68.97
compared to actual data

Subjects' predictions            29.25     21.45      8.71     11.94     30.69
compared to time series
predictions

Degrees of freedom               10,22      6,25      5,27      7,25      7,25

NOTE: The elements of the mean vector consisted of the average across
all subjects of predictions for each day (see Appendix C).


CHAPTER  V 
DISCUSSION 


This  study  examined  the  use  of  time  series  analysis  in  combination 
with  regression  techniques  for  the  description  of  behavioral  data.  The 
accuracy  with  which  the  resulting  descriptions  predicted  future  levels  of 
behavior  provided  the  criterion.    Predictions  using  time  series  models 
were  compared  to  predictions  by  two  groups  of  subjects  to  whom  the  data 
sets  were  displayed  on  the  Standard  Behavior  Chart. 

Summary  of  the  Results 

Measures  of  the  average  error  (of  primary  interest)  and  of  the  bias 
in  prediction  (of  secondary  interest)  were  computed  for  each  person-data 
set  combination  on  the  log  transformed  predictions.     In  all  cases  except 
the  bias  measure  for  data  set  D,  predictions  made  using  time  series  analysis 
techniques  were  more  accurate  than  predictions  by  the  majority  of  subjects. 
Comparisons showing greater accuracy for the time series analysis
predictions ranged from 18 of 32 for average error on data set 1 to 32 of 32
for average error on data set C.

Analysis  of  Variance  for  each  of  the  two  dependent  variables  allowed 
estimation  of  the  effects  on  subjects'  predictions  of  (1)  position  in  the 
sequence  of  presentation,   (2)  first  order  residual  effects  of  one  chart 
on  the  predictions  for  the  next  chart,  and  (3)  group  membership.  Results 
showed  that  none  of  these  effects  was  significant  for  either  dependent 
variable,  and  in  fact,  indicated  the  appropriateness  of  averaging  across 
all  subjects  for  each  chart. 

In all five of the resulting comparisons for the summary measure of
errors in prediction, time series predictions were more accurate than the
average across subjects for the summary measure. For the bias measure,
time series analysis predictions were less biased than the average of bias
measures across subjects for four of five charts. The exception was data
set D, for which the average of bias measures for subjects showed less
bias than time series predictions. These results were presented in Tables
7 and 9 and their practical significance is discussed below.

Because an average had been taken across subjects, Hotelling's T²
statistic was used to perform two tests. These tests measured the extent
to which an average of predictions across subjects, for each of the
predicted days for a given data set, would be representative of predictions
made by all 32 subjects. If variability around the mean vector of
subjects' predictions is small in comparison to the differences between
that mean vector and a vector of criterion values, then the value of the
test statistic will be significant. The results indicated that variability
around the vector of prediction means was small in comparison to the
difference between that vector and the vector of either (1) the actual
data or (2) the time series predictions.

Generality  of  the  Results 
For  this  particular  set  of  charts  and  subjects,  it  seems  clear  that 
predictions  made  using  time  series  analysis  are  more  accurate  than  subjects' 
predictions.    However,  the  generality  of  this  result  must  be  qualified 
in  several  ways. 

Standardization of Description

First,  generalization  of  the  results  of  this  study  must  be  qualified 
because  the  time  series  analysis  techniques  used  are  not  a  cookbook  set 
of operations that require only the feeding in of a data set. Human
judgment  is  required  at  several  stages,  and  there  is  no  reason  to  assume 
that,  given  a  standard  set  of  data,  all  persons  would  arrive  at  exactly 
the  same  model. 

However, this assumed lack of absolute standardization is insignificant
when compared to the differences among visual analyses exhibited
by  the  subjects  for  this  study.     Without  even  considering  the  magnitude 
of  the  slopes,  subjects  for  this  study  evidenced  considerable  disagreement 
as  to  the  direction  of  the  celeration  line  on  all  charts  except  A. 
Approximately  10%  of  subjects  drew  no  celeration  line,  while  about  15% 
drew  lines  with  a  1.0  celeration  (no  slope).    Of  the  remaining  subjects, 
those  assigning  the  opposite  sign  from  that  of  the  majority  ranged  from 
20%  to  27%  of  the  total. 
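
For readers unfamiliar with celeration, the sketch below shows one way a direction and magnitude could be computed from daily frequencies. It assumes a least squares line fitted to log10 frequencies and a weekly (7 day) celeration period; the subjects in this study drew their lines on the chart, so this is a numerical stand-in rather than their procedure. The function name and data are hypothetical.

```python
import numpy as np


def celeration(days, counts, period=7):
    """Multiplicative change per `period` days from a line fitted to log10 counts.

    Returns (sign, factor) with sign "×" for acceleration and "÷" for
    deceleration; a factor of 1.0 means no slope.
    """
    slope, _intercept = np.polyfit(days, np.log10(counts), 1)
    factor = float(10.0 ** (slope * period))
    return ("×", factor) if factor >= 1.0 else ("÷", 1.0 / factor)


# Hypothetical daily frequencies that double every week.
days = np.arange(15)
counts = 10.0 * 2.0 ** (days / 7.0)
print(celeration(days, counts))
```

Disagreement about direction, as reported above, corresponds to different analysts assigning × and ÷ to the same chart.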

Also,  results  from  the  identification  of  periodicity  were  even  less 
uniform.    The  number  of  days  in  the  periods  identified  varied  from  4  to 
60  days,  infrequently  coinciding  with  identifications  by  other  subjects 
or  time  series  analysis  identifications  (7,  5,  7,  and  3.5  days  for  charts 
A,  B,  C,  and  D  respectively).     In  addition,  several  subjects  made  no 
identifications  at  all. 
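
A simple numerical way to look for a dominant period, offered here only as an illustration, is to remove a linear trend and inspect the periodogram. The sketch assumes numpy's FFT routines, equally spaced daily observations with no gaps, and a hypothetical function name; it is not the identification procedure used in this study.

```python
import numpy as np


def dominant_period(series):
    """Period (in days) of the largest periodogram peak after linear detrending."""
    x = np.asarray(series, dtype=float)
    t = np.arange(len(x))
    x = x - np.polyval(np.polyfit(t, x, 1), t)   # remove the linear trend
    power = np.abs(np.fft.rfft(x)) ** 2          # raw periodogram
    freqs = np.fft.rfftfreq(len(x), d=1.0)       # cycles per day
    k = 1 + int(np.argmax(power[1:]))            # skip the zero frequency
    return 1.0 / freqs[k]


# Hypothetical 8 weeks of daily data with a trend and a weekly cycle.
t = np.arange(56)
series = 0.05 * t + np.sin(2 * np.pi * t / 7)
print(dominant_period(series))
```

With short or noisy series the largest peak can be misleading, so a result of this kind would need to be checked against the data before being treated as a real periodicity.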

Limitations  on  the  Use  of  Time  Series  Analysis 

As  a  second  restriction,  the  type  of  data  which  must  be  available 
for  the  proper  use  of  time  series  analysis  techniques  constitutes  a  much 
more  substantial  limitation  on  the  generality  of  the  results  of  this 
study.     In  fact,  time  series  analysis  is  not  appropriate  for  even  a 
majority    of  behavioral  data  for  several  reasons. 
Homogeneity of variance assumption. First, the Box-Jenkins model,
with its attendant assumption of stationarity, implies homogeneity of
variance in the a_t's. If it is possible to reduce variance in the later
part of a time series through increased experimental control, then the
techniques of model building used for this dissertation are inappropriate.
In the preferred case where sufficient experimental control is attained,
statistical analysis may also become unnecessary.

Quantity of data needed. Even more likely to be abused is the
requirement of sufficient data to identify correctly the Box-Jenkins
model. For example, Jones, Vaught, and Weinrott (1977), in advocating
time series analysis of behavioral data, have reanalyzed data from Baer,
Rowbury, and Baer (1973) with as few as four days in a data set. A more
realistic minimum to allow identification of the correct model is 35 to
40 data points for each separate set of environmental conditions. This
quantity  of  data  is  especially  important  in  view  of  the  wide  variety  of 
models  fit  to  the  data  sets  for  this  dissertation.     This  requirement 
restricts  the  generality  of  the  results  of  this  study  because  it  is  in 
direct  contradiction  to  the  tactics  for  behavioral  change  in  academic  tasks 
which  are  utilized  by  many  precision  teachers.     If  an  arrangement  works, 
the  aim  will  probably  be  reached  before  enough  data  are  accumulated  to  do 
time  series  analysis.     If  an  arrangement  does  not  work,  a  new  arrangement 
of  environmental  conditions  should  be  initiated  before  five  or  more  weeks 
have  passed.    As  a  result,  time  series  analysis  is  not  appropriate  for  use 
with  much  precision  teaching  data. 
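
A rough numerical illustration of why very short series cannot support model identification: identification typically relies on sample autocorrelations, and a common large-sample rule of thumb judges them against bounds of about plus or minus 2 divided by the square root of N. The rule of thumb, the function names, and the data below are assumptions introduced for illustration and are not drawn from this study.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = float(d @ d)
    return np.array([float(d[:len(x) - k] @ d[k:]) / denom
                     for k in range(1, max_lag + 1)])


def approx_bound(n):
    """Approximate two-standard-error bound for a white noise autocorrelation."""
    return 2.0 / np.sqrt(n)


# Hypothetical four-day data set.
print(sample_acf([1.0, 2.0, 3.0, 4.0], max_lag=1))
for n in (4, 40):
    print(n, approx_bound(n))
```

With four observations the bound equals 1.0, so no sample autocorrelation could be distinguished from zero, whereas with 40 observations the bound is about 0.32. The approximation itself is unreliable at such small N, which only strengthens the point.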

Human  behavioral  research  provides  a  more  likely  setting  for  the 
gathering  of  data  appropriate  for  time  series  analysis.    Here,  great 
emphasis  is  placed  on  the  proper  use  of  steady  states  as  a  crucial 
feature of proper experimentation. Because of this emphasis, repeated
observations should be made from a point in time considerably before a
treatment intervention until after the treatment effects have stabilized.
Although Johnston and Pennypacker (in press) were hesitant to define
precisely  the  steady  state,  they  did  indicate  that  it  should  exhibit  "a 
lack  of  relatively  systematic  change  in  the  data  in  an  increasing  or 
decreasing  direction"  and  "a  constant  range  of  variability."  Criteria 
in  current  use  are  admittedly  crude,  with  visual  inspection  most  often 
used.    However,  it  may  be  that  time  series  analysis  techniques  afford  the 
careful  researcher,  with  sufficient  data,  opportunities  for  far  more 
stringent  and  precise  criteria  of  stability.     For  example,  the  asymptotic 
multivariate  normality  of  the  parameter  estimates  mentioned  above  makes 
possible  a  statement  of  the  probability  that  collecting  another  set  of 
data  under  identical  conditions  would  result  in  a  different  decision  as 
to  the  presence  of  trends  in  the  data. 

Missing  data.    As  a  third  restriction  on  data  for  the  use  of  time 
series  analysis,  it  should  be  noted  that  missing  data  within  the  span  of 
a  time  series  are  problematic.    The  fact  that  all  five  data  sets  used  for 
this  research  are  self-report  data  is  due  primarily  to  this  requirement, 
since  data  for  many  charts,  especially  those  based  on  short  timings,  are 
kept  for  at  most  five  days  a  week.    This  restriction  is,  therefore,  also 
indirectly  responsible  for  the  fact  that  all  five  data  sets  are  recorded 
in  the  lower  cycles  of  the  Standard  Behavior  Chart.    This  is  because 
upper  cycle  charts  tend  to  be  those  based  on,  for  example,  one  minute 
timings  of  academic  behavior  recorded  two  to  five  times  per  week. 
Supplementary Analyses
It  is  possible  to  ask  a  number  of  subsidiary  questions  of  these  data 
for  which  there  are  no  experimental  answers,  but  for  which  there  is 
some  information  available. 

Visual  Effects  of  Low  Cycle  Charts 

On only 29 of 159 person-chart combinations presented in Tables 1 to
5 were bias measures assigned the ÷ sign, indicating that estimates tended
to be below the actually occurring data. It can be conjectured that the
occurrence of the data sets toward the bottom of the Standard Behavior
Chart  resulted  in  a  visual  tendency  to  place  dots  above  trends  in  the  data; 
however,  time  series  analysis  techniques  were  also  biased  upward  for  all 
five  charts  and  were  certainly  not  subject  to  visual  effects.     For  some 
charts,  a  more  likely  explanation  is  that  the  data  which  actually  occurred 
on  predicted  days  were  lower  than  the  "expected"  levels  in  view  of  general 
trends  in  the  data. 

Another  possible  explanation  related  to  the  placement  of  all  five 
data  sets  in  the  lower  cycle  of  the  Standard  Behavior  Chart  is  that  those 
subjects experienced with this type of chart perform better than those
whose experience is mainly with upper cycle charts. However, of ten
subjects whose predictions compared most favorably with the results of time
series analysis predictions, five have worked primarily with top cycle
charts and another three are well versed in both top and bottom cycle charts.

Experience  as  a  Controlling  Variable  for  Subjects'  Predictions 

Another  possible  controlling  variable  is  the  number  of  years  of 
experience  with  the  Standard  Behavior  Chart.    However,  experience  does  not 
distinguish  those  whose  predictions  compared  more  favorably  with  time  series 
analysis  predictions  from  those  who  fared  less  well  (this  information  is 
almost synonymous with the grouping for Analysis of Variance). A similar
suggestion that the amount of recent experience with charted data might
be correlated with prediction performance does not seem to be supported
by the results of this research, but it was not possible to check this
rigorously from the information available.

Relations  between  Chart  Characteristics  and  Subjects'  Predictions 

An  hypothesis  that  does  gain  some  support,  but  not  experimental 
confirmation,  from  the  results  of  this  study  is  that  errors  in  prediction 
will  tend  to  be  greater  for  charts  with  greater  variation  around  the 
trend  line.    Charts  1  and  A  (Appendix  A)  show  greater  variation  than  the 
other three charts and also occasioned larger errors in prediction (see
Table 7). The regression weights from the Analysis of Variance for these
charts  confirm  this  relationship  with  a  large  positive  weight  for  Chart  A 
contrasting  to  negative  weights  (indicating  smaller  errors  since  the 
measure  of  average  error  was,  by  definition,  positive  in  value)  for  Charts 
B,  C,  and  D.    Also,  similar  information  is  supplied  in  Appendix  C,  with 
additional  information  indicating  that  greater  variability  around  the 
regression  line  occasions  greater  variability  among  subjects  in  the  errors 
in  prediction  for  a  given  day. 

However,  when  comparing  the  average  size  of  prediction  errors  for 
Chart  D  with  results  for  Charts  B  and  C,  which  exhibit  similar  variability, 
the  pattern  of  larger  errors  for  more  variable  charts  does  not  hold.  The 
larger  errors  for  Chart  D  may  be  attributable  to  errors  in  prediction  for 
the  sixth  day,  whose  actual  datum  is  lower  than  that  for  any  of  the  days 
used as a basis for prediction. Alternatively, the larger errors for
Chart D may be taken as support for another hypothesized effect. In this
hypothesis, the larger errors in prediction would be attributed to the
presence of a strong cyclical pattern in the data (the sine wave fit to
Chart D indicates an extremely regular cyclical pattern in the data).
The  fact  that  subjects'  errors  for  Chart  A  are  larger  than  for  Chart  1 
lends  additional  support  to  this  hypothesis,  since  Chart  A  data  contained 
a  weekly  pattern. 

Conclusions 

Time  series  analysis,  in  conjunction  with  regression  analysis,  pro- 
vides a  precise  set  of  techniques  for  detecting  and  describing  structure 
in  those  behavioral  data  sets  for  which  its  use  is  appropriate.  The 
resulting  models  account  for  more  of  the  variability  than  do  visual 
analyses,  and  if  it  is  accepted  that  "the  overriding  and  unifying  task  of 
science  is  to  account  for,  or  explain,  variability"  (Johnston  and 
Pennypacker,  in  press),  then  we  have  in  time  series  analysis  a  useful 
tool. Furthermore, predictions made using time series analysis are more
accurate than those made by visual analysts using the Standard Behavior
Chart.
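
The combination of regression analysis and time series analysis can be sketched in two steps: fit a linear trend by least squares, then examine the dependence remaining in the residuals. The Python sketch below is illustrative only. It computes sample autocorrelations of the residuals, which may differ from the estimator behind the autoregression coefficients tabled in the appendix, and the function names and the 42-day data set are hypothetical.

```python
def linear_trend(values):
    """Least-squares intercept and slope of values regressed on day 0..n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def residual_autocorrelations(values, max_lag):
    """Sample autocorrelations (lags 1..max_lag) of the trend residuals."""
    intercept, slope = linear_trend(values)
    resid = [y - (intercept + slope * t) for t, y in enumerate(values)]
    denom = sum(r * r for r in resid)  # residuals have mean zero by construction
    return [
        sum(resid[t] * resid[t + lag] for t in range(len(resid) - lag)) / denom
        for lag in range(1, max_lag + 1)
    ]


# Hypothetical data: a rising trend plus a bump on every seventh day.
values = [20 + 0.5 * t + (3 if t % 7 == 0 else 0) for t in range(42)]
intercept, slope = linear_trend(values)
acf = residual_autocorrelations(values, 8)
print(round(slope, 3), [round(r, 2) for r in acf])
```

In this constructed example the largest residual autocorrelation falls at lag 7, the kind of weekly dependence that a regression line alone would leave undescribed.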

The  amount  of  time  involved  in  time  series  analysis  and  the  restric- 
tions on  data  sets  for  which  its  use  is  appropriate  preclude  the  wide- 
spread use  of  time  series  analysis  among  applied  practitioners.  However, 
the  results  obtained  here  warrant  serious  consideration  of  time  series 
analysis  by  those  engaged  in  human  behavioral  research.     In  fact,  this 
is  already  happening.     Glass,  Willson,  and  Gottman  (1975)  have  published 
an  important  new  book  on  time  series  intended  for  behaviorists  as  well  as 
others.    Also,  Kazdin  (1976)  has  recently  authored  an  invited  chapter  for 
Hersen and Barlow on "Statistical Analysis for Single-case Experimental
Designs" with the longest section discussing time series analysis at a
non-technical  level. 

Given  that  time  series  analysis  enables  valid  and  precise  description 
of  a  single  data  set,  it  follows  that  increased  precision  is  available 
for the comparison of two or more time series data sets. Where both
time  series  data  sets  represent  different  phases  of  the  same  experiment, 
measures such as those described by Pennypacker et al. (1972) can be
computed  precisely. 

Where two simultaneous sets of time series data have been collected,
transfer function modeling as discussed by Box and Tiao (1975) can be
used to precisely describe the relationship between two time series.
Multiple  baseline  designs  would  be  an  important  area  for  application  of 
transfer  function  modeling,  as  would  situations  where  time  series  data  are 
available  on  both  one  or  more  independent  variables  and  a  dependent 
variable.    This  latter  situation  is  essentially  the  situation  for  which 
these  techniques  were  developed,  where  the  goal  was  to  model  the 
relationship  between  input  and  output  in  an  industrial  process,  electrical 
circuit,    or  mechanical  system. 
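
A common first step in identifying an input-output relationship of this kind is to look at the cross-correlation between the two series at several lags. The Python sketch below shows only that screening step, under simplifying assumptions: it omits prewhitening and parameter estimation, so it is not the full procedure discussed by Box and Tiao. The function names, the simulated input, and the two-day delay are all hypothetical.

```python
import math
import random


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag], for lag >= 0."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    num = sum((x[t] - x_mean) * (y[t + lag] - y_mean) for t in range(n - lag))
    den = math.sqrt(
        sum((v - x_mean) ** 2 for v in x) * sum((v - y_mean) ** 2 for v in y)
    )
    return num / den


def best_lag(x, y, max_lag):
    """Lag (0..max_lag) at which the input is most strongly related to the output."""
    return max(range(max_lag + 1), key=lambda lag: abs(cross_correlation(x, y, lag)))


# Hypothetical data: the output follows the input with a two-day delay.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [5.0, 5.0] + [5 + 2 * x[t - 2] for t in range(2, 60)]
print(best_lag(x, y, 5))
```

An uncorrelated input is used here so that the delay stands out without prewhitening. With autocorrelated behavioral data the raw cross-correlations would be harder to interpret.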

Although  certain  assumptions  are  still  necessary  for  the  resulting 
analysis  to  be  appropriate,  the  need  for  considerable  data  may,  with 
transfer  function  modeling,  be  met  by  utilizing  data  from  several  phases 
of  an  experiment.    The  precision  which  will  then  be  available  for  describing 
the  functional  relationship  between  changes  in  the  environment  and  result- 
ing changes  in  behavior  represents  an  important  potential  contribution  of 
time  series  analysis  to  human  behavioral  research. 
Table 11

Autoregression Coefficients for Regression Residuals

Data Set 1

Lag   Coefficient     Lag   Coefficient
  1      .1820          9      .0603
  2     -.0111         10      .1710
  3     -.1576         11      .1722
  4     -.0838         12      .0046
  5      .0422         13     -.1948
  6     -.0461         14      .0643
  7      .0247         15     -.1520
  8     -.0916         16      .0492

Table 12

Autoregression Coefficients for Regression Residuals

Data Set A

Lag   Coefficient     Lag   Coefficient
  1      .0033          9     -.0538
  2      .0180         10     -.1658
  3     -.0130         11     -.2038
  4     -.1199         12     -.0006
  5      .0428         13     -.0475
  6     -.1044         14      .2538
  7      .4361         15      .0131
  8      .0400         16     -.0431


Figure  7.    Spectral  Density  Function  for  Data  Set  A. 
Table 13

Autoregression Coefficients for Regression Residuals

Data Set B

Lag   Coefficient     Lag   Coefficient
  1      .0790          9      .1150
  2     -.2327         10     -.1876
  3     -.2808         11     -.0690
  4     -.0557         12      .0526
  5      .1414         13      .1322
  6      .0402         14      .1175
  7     -.0370         15     -.1888
  8     -.1580         16     -.0962

Figure 8. Spectral Density Function for Data Set B.

Table 14

Autoregression Coefficients for Regression Residuals

Data Set C

Lag   Coefficient     Lag   Coefficient
  1      .2464         10     -.1989
  2     -.1181         11     -.2320
  3     -.2288         12     -.2052
  4     -.2472         13      .0631
  5     -.3153         14      .3059
  6      .0890         15      .0599
  7      .4666         16     -.0826
  8      .1482         17     -.1358
  9     -.1477


Figure 9. Spectral Density Function for Data Set C. (Horizontal axis: period in days.)

Table 15

Autoregression Coefficients for Regression Residuals

Data Set D

Lag   Coefficient     Lag   Coefficient
  1     -.0098         10      .0808
  2     -.1973         11      .1676
  3      .0002         12     -.2428
  4      .1774         13     -.0393
  5     -.0829         14      .2060
  6     -.1770         15      .0401
  7      .1947         16     -.0440
  8     -.1380         17     -.0184
  9     -.1016         18     -.0179

Figure 10. Spectral Density Function for Data Set D. (Horizontal axis: period in days.)


APPENDIX  C 
HISTOGRAMS  OF  SUBJECTS'  PREDICTIONS 
bachelor's degree from Yale University and in 1974 an M. Ed. degree from
the  University  of  Florida. 

During  the  years  from  1970  to  1975  I  taught  music  in  the  public 
schools, for one year at Chiefland, Florida, and for four years in
Gainesville. 

Geri  and  I  have  been  married  for  just  over  nine  years  and  have 
recently  moved  to  the  Boston  area,  where  I  am  currently  a  Research 
Associate  for  two  health  grants  at  Tufts  University. 
conforms to acceptable standards of scholarly presentation and is
fully  adequate,  in  scope  and  quality,  as  a  dissertation  for  the  degree 
of  Doctor  of  Philosophy. 


William  B.  Ware,  Chairman 